Secretory expression in eukaryotes.

Methods and compositions are provided for producing
polypeptide sequences in high yield by employing DNA
constructs, wherein the DNA sequence encoding for the
polypeptide of interest is preceded by a leader sequence and
processing sequence for secreting and processing said
polypeptide. In this manner, the mature polypeptide of
interest may be isolated from the nutrient medium
substantially free of major amounts of other proteins and
cellular debris.

The yeast strain S. cerevisiae AB103 (pYEGF-8) was
deposited on January 5, 1983, at the A.T.C.C. and given
Accession No. 20658.

The plasmid pYαEGF-23 (pAB114-pC1/1) was deposited
at the A.T.C.C. on August 12, 1983, and given Accession No.
40079.

9729-1-1/CCCCC2
SECRETORY EXPRESSION IN EUKARYOTES 

BACKGROUND OF THE INVENTION 
Field of the Invention 
Hybrid DNA technology has revolutionized the
ability to produce polypeptides of an infinite variety
of compositions. Since living forms are composed of
proteins and employ proteins for regulation, the
ability to duplicate these proteins at will offers
unique opportunities for investigating the manner in
which these proteins function and the use of such
proteins, fragments of such proteins, or analogs in
therapy and diagnosis.

There have been numerous advances in improving
the rate and amount of protein produced by a cell.
Most of these advances have been associated with higher
copy numbers, more efficient promoters, and means for
reducing the amount of degradation of the desired
product. It is evident that it would be extremely
desirable to be able to secrete polypeptides of interest,
where such polypeptides are the product of interest.

Furthermore, in many situations, the polypeptide
of interest does not have an initial methionine
amino acid. This is usually a result of there being a
processing signal in the gene encoding for the
polypeptide of interest, which the gene source recognizes and
cleaves with an appropriate peptidase. Since in most
situations, genes of interest are heterologous to the
host in which the gene is to be expressed, such
processing occurs imprecisely and in low yield in the
expression host. In this case, while the protein which is
obtained will be identical to the peptide of interest
for almost all of its sequence, it will differ at the
N-terminus, which can deleteriously affect physiological
activity.

There are, therefore, many reasons why it
would be extremely advantageous to prepare DNA
sequences which would encode for the secretion and
maturing of the polypeptide product. Furthermore,
where sequences can be found for processing, which
result in the removal of amino acids superfluous to the
polypeptide of interest, the opportunity exists for
having a plurality of DNA sequences, either the same or
different, in tandem, which may be encoded on a single
transcript.

Description of the Prior Art 

U.S. Patent No. 4,336,336 describes for
prokaryotes the use of a leader sequence coding for a
non-cytoplasmic protein normally transported to or beyond
the cell surface, resulting in transfer of the fused
protein to the periplasmic space. U.S. Patent No.
4,338,397 describes for prokaryotes using a leader
sequence which provides for secretion with cleavage of
the leader sequence from the polypeptide sequence of
interest. U.S. Patent No. 4,338,397, columns 3 and 4,
provides useful definitions, which definitions are
incorporated herein by reference.

Kurjan and Herskowitz, Cell (1982) 30:933-943
describes a putative α-factor precursor containing four
tandem copies of mature α-factor, describing the
sequence and postulating a processing mechanism.
Kurjan and Herskowitz, Abstracts of Papers presented at
the 1981 Cold Spring Harbor meeting on The Molecular
Biology of Yeasts, page 242, in an Abstract entitled,
"A Putative α-Factor Precursor Containing Four Tandem
Repeats of Mature α-Factor," describe the sequence
encoding for the α-factor and spacers between two of
such sequences. Blair et al., Abstracts of Papers,
ibid., page 243, in an Abstract entitled "Synthesis and
Processing of Yeast Pheromones: Identification and
Characterization of Mutants That Produce Altered
α-Factors," describe the effect of various mutants on
the production of mature α-factor.

SUMMARY OF THE INVENTION 
Methods and compositions are provided for
producing mature polypeptides. DNA constructs are
provided which join the DNA fragments encoding for a
yeast leader sequence and processing signal to
heterologous genes for secretion and maturation of the
polypeptide product. The construct of the DNA encoding for
the N-terminal cleavable oligopeptide and the DNA
sequence encoding for the mature polypeptide product
can be joined to appropriate vectors for introduction
into yeast or other cell which recognizes the processing
signals for production of the desired polypeptide.
Other capabilities may also be introduced into the
construct for various purposes.

BRIEF DESCRIPTION OF THE DRAWINGS 
Fig. 1 is a flow diagram indicating the
construction of pYαEGF-21.
Fig. 2 shows sequences at fusions of hEGF to
the vector. a. through e. show the sequences at the
N-terminal region of hEGF, which differ among several
constructions, and f. shows the C-terminal region of
hEGF.

DESCRIPTION OF THE SPECIFIC EMBODIMENTS

In accordance with the subject invention,
eukaryotic hosts, particularly yeast, are employed for
the production of mature polypeptides where such
polypeptides may be harvested from a nutrient medium.
The polypeptides are produced by employing a DNA
construct encoding for yeast leader and processing
signals joined to a polypeptide of interest, which may
be a single polypeptide or a plurality of polypeptides
separated by processing signals. The resulting
construct encodes for a pre-pro-polypeptide which will
contain the signals for secretion of the
pre-pro-polypeptide and processing of the polypeptide, either
intracellularly or extracellularly, to the mature
polypeptide.

The constructs of the subject invention will, for example,
have at least the following formula defining a
pro-polypeptide:

((R)_r-(GAXYCX)_n-Gene*)_y

wherein:

R is CGX or AZZ, the codons coding for lysine
and arginine, each of the Rs being the same or different;

r is an integer of from 2 to 4, usually 2 to
3, preferably 2 or 4;

X is any of the four nucleotides, T, G, C, or A;

Y is G or C;

y is an integer of at least one and usually
not more than 10, more usually not more than four,
providing for monomers and multimers;

Z is A or G;

Gene* is a gene other than α-factor, usually
foreign to a yeast host, usually a heterologous gene,
desirably a plant or mammalian gene; and

n is 0 or an integer which will generally
vary from 1 to 4, usually 2 to 3.
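
The degenerate symbols for R can be expanded mechanically. The following is a minimal sketch, assuming the standard genetic code and one-letter amino acid codes; the names `expand`, `SYMBOLS` and `CODON_TABLE` are illustrative and do not come from the specification. It lists every codon matching CGX or AZZ and the amino acid each one encodes.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

# Degenerate symbols as defined in the text: X = any base, Z = A or G.
SYMBOLS = {"X": "TGCA", "Z": "AG", "A": "A", "C": "C", "G": "G", "T": "T"}

def expand(pattern):
    """Return every concrete sequence matching a degenerate pattern."""
    return ["".join(p) for p in product(*(SYMBOLS[s] for s in pattern))]

# Map each codon allowed for R to its one-letter amino acid (K = Lys, R = Arg).
r_codons = {c: CODON_TABLE[c] for pat in ("CGX", "AZZ") for c in expand(pat)}
for codon, aa in sorted(r_codons.items()):
    print(codon, aa)
```

Under these assumptions the eight codons obtained all encode lysine or arginine, in agreement with the definition of R.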

The pro-polypeptide has an N-terminal processing
signal for peptidase removal of the amino acids
preceding the amino acids encoded for by Gene*.

For the most part, the constructs of the
subject invention will have at least the following
formula:

L-((R-S)-(GAXYCX)_n-Gene*)_y

defining a pre-pro-polypeptide, wherein all
the symbols except L and S have been defined, S having
the same definition as R, there being one R and one S, and L
is a leader sequence providing for secretion of the
pre-pro-polypeptide. While it is feasible to have more
Rs and Ss, there will usually be no advantage in the
additional amino acids. Any leader sequence may be
employed which provides for secretion, leader sequences
generally being of about 30 to 120 amino acids, usually
about 30 to 100 amino acids, having a hydrophobic
region and having a methionine at its N-terminus.

The construct when n is 0 will have the
following formula:

L-((R)_r'-Gene*)_y

defining a pre-pro-polypeptide, wherein all the symbols
have been defined previously, except r', wherein:

r' is 2 to 4, preferably 2 or 4.

Of particular interest is the leader sequence
of α-factor which is described in Kurjan and
Herskowitz, supra, on page 937, or fragments or analogs
thereof, which provide for efficient secretion of the
desired polypeptides. Furthermore, the DNA sequence
indicated in the article, which sequence is incorporated
herein by reference, is not essential, any sequence
which encodes for the desired oligopeptide being
sufficient. Different sequences will be more or less
efficiently translated.

While the above formulas are preferred, it 
should be understood, that with suppressor mutants, 
other sequences could be provided which would result in 
the desired function. Normally, suppressor mutants are 
not as efficient for expression and, therefore, the 
above indicated sequence or equivalent sequence encoding 
for the same amino acid sequence is preferred. To the 
extent that a mutant will express from a different 
codon the same amino acids which are expressed by the 
above sequence, then such alternative sequence could be 
permitted. 

The dipeptides which are encoded for by the
sequence in the parenthesis will be an acidic amino
acid, aspartic or glutamic, preferably glutamic,
followed by a neutral amino acid, alanine or proline,
particularly alanine.
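
The dipeptides encoded by GAXYCX can be enumerated in the same way. This is a sketch under the assumption of the standard genetic code, with X and Y as defined for the formulas above; the helper names `expand` and `translate` are illustrative only.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

# X = any base, Y = G or C, as defined in the text.
SYMBOLS = {"X": "TGCA", "Y": "GC", "A": "A", "C": "C", "G": "G", "T": "T"}

def expand(pattern):
    """Return every concrete sequence matching a degenerate pattern."""
    return ["".join(p) for p in product(*(SYMBOLS[s] for s in pattern))]

def translate(seq):
    """Translate whole codons from the first base; ignore a trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))

# Distinct dipeptides in one-letter code (D = Asp, E = Glu, A = Ala, P = Pro).
dipeptides = sorted({translate(s) for s in expand("GAXYCX")})
print(dipeptides)
```

Each of the 32 expansions reads as Asp or Glu followed by Ala or Pro, which is the acidic-then-neutral pattern described above.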

In providing for useful DNA sequences which
can be used for cassettes for expression, the following
sequence can be conveniently employed:

Tr-L-((R-S)_r"-(GAXYCX)_n'-W-(Gene*)_d)_y

wherein:

Tr intends a DNA sequence encoding for the
transcriptional regulatory signals, particularly the
promoter and such other regulatory signals as operators,
activators, cap signal, signals enhancing ribosomal
binding, or other sequence involved with transcriptional
or translational control. The Tr sequence will generally
be at least about 100bp and not more than about 2000bp.
Particularly useful is employing the Tr sequence
associated with the leader sequence L, so that a DNA
fragment can be employed which includes the
transcriptional and translational signal sequences associated
with the leader sequence endogenous to the host.
Alternatively, one may employ other transcriptional and
translational signals to provide for enhanced production
of the expression product;

d is 0 or 1, being 1 when y is greater than 1;

n' is a whole number, generally ranging from
0 to 3, more usually being 0 or 2 to 3;

r" is 1 or 2;

W intends a terminal deoxyribosyl-3' group,
or a DNA sequence which by itself or, when n' is other
than 0, in combination with the nucleotides to which it
is joined, defines a restriction site, having either
a cohesive end or butt end, wherein W may have from 0
to about 20 nucleotides in the longest chain;

the remaining symbols having been defined
previously.

Of particular interest is the following
construct:

(Tr)_a-L-(R-S)_r"-(GAXYCX)_n"-GA[AGCT]

wherein:

all of the symbols previously defined have
the same definition;

a is 0 or 1, intending that the construct may
or may not have the transcriptional and translational
signals;

the nucleotides indicated in the broken box
(shown here in square brackets) are intended not to be
present but to be capable of addition by adding a
HindIII cleaved terminus to provide for the recreation
of the sequence encoding for a dipeptide; and

n" will be 0 to 2, where at least one of the
Xs and Ys defines a nucleotide, so that the sequence in
the parenthesis is other than the sequence GAAGCT.
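
As a small worked illustration of how a HindIII-cleaved terminus recreates the dipeptide codons, the sketch below joins the two bases that end the construct to the four bases of a HindIII cohesive end. It assumes that HindIII cleaves A^AGCTT (leaving the four-base end AGCT) and that the boxed nucleotides are AGCT; the variable names are illustrative.

```python
# Only the two codons needed here; this is not a full genetic code table.
CODON_TO_AA = {"GAA": "Glu", "GCT": "Ala"}

construct_end = "GA"       # bases ending the construct, outside the broken box
hindiii_overhang = "AGCT"  # bases supplied by a HindIII-cleaved terminus

# Joining the two pieces restores a six-base sequence read as two codons.
restored = construct_end + hindiii_overhang
dipeptide = [CODON_TO_AA[restored[i:i + 3]] for i in range(0, len(restored), 3)]
print(restored, dipeptide)
```

The restored sequence reads as the codons GAA and GCT, that is, the Glu-Ala dipeptide of the processing signal.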

The coding sequence of Gene* may be joined to
the terminal T, providing that the coding sequence is
in frame with the initiation codon and upon processing
the first amino acid will be the correct amino acid for
the mature polypeptide.

The 3'-terminus of Gene* can be manipulated
much more easily and, therefore, it is desirable to
provide a construct which allows for insertion of Gene*
into a unique restriction site in the construct. Such
a construct would provide for a restriction site with
insertion of the Gene* into the restriction site to be
in frame with the initiation codon. Such a construction
can be symbolized as follows:

(Tr)_a-L-(R-S)_r"-(GAXYCX)_n"-W-(SC)_b-Te

wherein:

those symbols previously defined have the
same definition;

SC are stop codons;

Te is a termination sequence balanced with
the promoter Tr, and may include other signals, e.g.
polyadenylation; and

b is an integer which will generally vary
from about 0 to 4, more usually from 0 to 3, it being
understood that Gene* may include its own stop codons.

Illustrative of a sequence having the above
formula is where W is the sequence GA and n" is 2.

Of particular interest is where the sequence
encoding the terminal dipeptide is taken together with
W to define a linker or connector, which allows for
recreation of the terminal sequence defining the
dipeptide of the processing signal and encodes for the
initial amino acids of Gene*, so that the codons are in
frame with the initiation codon of the leader. The
linker provides for a staggered or butt ended
termination, desirably defining a restriction site in
conjunction with the successive sequences of the Gene*. Upon
ligation of the linker with Gene*, the codons of Gene*
will be in frame with the initiation codon of the
leader. In this manner, one can employ a synthetic
sequence which may be joined to a restriction site in
the processing signal sequence to recreate the
processing signal, while providing the initial bases of the
Gene* encoding for the N-terminal amino acids. By
employing a synthetic sequence, the synthetic linker
can be a tailored connector having a convenient
restriction site near the 3'-terminus and the synthetic
connector will then provide for the necessary codons
for the 5'-terminus of the gene.

Alternatively, one could introduce a
restriction endonuclease recognition site downstream from the
processing signal to allow for cleavage, and removal of
superfluous bases to provide for ligation of the Gene*
to the processing signal in frame with the initiation
codon. Thus the first codon would encode for the
N-terminal amino acid of the polypeptide. Where T is
the first base of Gene*, one could introduce a
restriction site where the recognition sequence is downstream
from the cleavage site. For example, a Sau3A
recognition sequence could be introduced immediately after the
processing signal, which would allow for cleavage and
linking of the Gene* with its initial codon in frame
with the leader initiation codon. With restriction
endonucleases which have the recognition sequence
distal and downstream from the cleavage site, e.g. HgaI,
W could define such sequence which could include a
portion of the processing signal sequences. Other
constructions can also be employed, employing such
techniques as primer repair and in vitro mutagenesis to
provide for the convenient insertion of Gene* into the
construct by introducing an appropriate restriction
site.

The construct provides a portable sequence
for insertion into vectors, which provide the desired
replication system. As already indicated, in some
instances, it may be desirable to replace the wild type
promoter associated with the leader sequence with a
different promoter. In yeast, promoters involved with
enzymes in the glycolytic pathway can provide for high
rates of transcription. These promoters are associated
with such enzymes as phosphoglucoisomerase,
phosphofructokinase, phosphotriose isomerase,
phosphoglucomutase, enolase, pyruvic kinase,
glyceraldehyde-3-phosphate dehydrogenase, and alcohol dehydrogenase.
These promoters may be inserted upstream from the
leader sequence. The 5'-flanking region to the leader
sequence may be retained or replaced with the
3'-sequence of the alternative promoter. Vectors can be
prepared and have been reported which include promoters
having convenient restriction sites downstream from the
promoter for insertion of such constructs as described
above.

The final construct will be an episomal
element capable of stable maintenance in a host,
particularly a fungal host such as yeast. The construct
will include one or more replication systems, desirably
two replication systems, allowing for maintenance in
the expression host and cloning in a prokaryote. In
addition, one or more markers for selection will be
included, which will allow for selective pressure for
maintenance of the episomal element in the host.
Furthermore, the episomal element may be a high or low
copy number, the copy number generally ranging from
about 1 to 200. With high copy number episomal elements,
there will generally be at least 10, preferably at
least 20, and usually not exceeding about 150, more
usually not exceeding about 100 copy number. Depending
upon the Gene*, either high or low copy numbers may be
desirable, depending upon the effect of the episomal
element on the host. Where the presence of the
expression product of the episomal element may have a
deleterious effect on the viability of the host, a low copy
number may be indicated.

Various hosts may be employed, particularly
mutants having desired properties. It should be
appreciated that depending upon the rate of production
of the expression product of the construct, the
processing enzyme may or may not be adequate for
processing at that level of production. Therefore, a mutant
having enhanced production of the processing enzyme may
be indicated or enhanced production of the enzyme may
be provided by means of an episomal element. Generally,
the production of the enzyme should be of a lower order
than the production of the desired expression product.

Where one is using α-factor for secretion and
processing, it would be appropriate to provide for
enhanced production of the processing enzyme Dipeptidyl
Amino Peptidase A, which appears to be the expression
product of STE13. This enzyme appears to be specific
for X-Ala- and X-Pro-sequences, where X in this instance
intends an amino acid, particularly the dicarboxylic
acid amino acids.

Alternatively, there may be situations where
intracellular processing is not desired. In this
situation, it would be useful to have a ste13 mutant,
where secretion occurs, but the product is not
processed. In this manner, the product may be
subsequently processed in vitro.

Host mutants which provide for controlled
regulation of expression may be employed to advantage.
For example, with the constructions of the subject
invention where a fused protein is expressed, the
transformants have slow growth which appears to be a
result of toxicity of the fused protein. Thus, by
inhibiting expression during growth, the host may be
grown to high density before changing the conditions to
permissive conditions for expression.

A temperature-sensitive sir mutant may be
employed to achieve regulated expression. Mutation in
any of the SIR genes results in a non-mating phenotype
due to in situ expression of the normally silent MATa
and MATα sequences present at the HML and HMR loci.

Furthermore, as already indicated, the Gene*
may have a plurality of sequences in tandem, either the
same or different sequences, with intervening processing
signals. In this manner, the product may be processed
in whole or in part, with the result that one will
obtain the various sequences either by themselves or in
tandem for subsequent processing. In many situations,
it may be desirable to provide for different sequences,
where each of the sequences is a subunit of a particular
protein product.

The Gene* may encode for any type of
polypeptide of interest. The polypeptide may be as small as
an oligopeptide of 8 amino acids or may be 100,000
daltons or higher. Usually, single chains will be less
than about 300,000 daltons, more usually less than
about 150,000 daltons. Of particular interest are
polypeptides of from about 5,000 to 150,000 daltons,
more particularly of about 5,000 to 100,000 daltons.
Illustrative polypeptides of interest include hormones
and factors, such as growth hormone, somatomedins,
epidermal growth factor, the endocrine secretions, such
as luteinizing hormone, thyroid stimulating hormone,
oxytocin, insulin, vasopressin, renin, calcitonin,
follicle stimulating hormone, prolactin, etc.;
hematopoietic factors, e.g. erythropoietin, colony stimulating
factor, etc.; lymphokines; globins; globulins, e.g.
immunoglobulins; albumins; interferons, such as α, β
and γ; repressors; enzymes; endorphins, e.g. β-endorphin,
enkephalin, dynorphin, etc.

Having prepared the episomal elements
containing the constructs of this invention, one may then
introduce such element into an appropriate host. The
manner of introduction is conventional, there being a
wide variety of ways to introduce DNA into a host.
Conveniently, spheroplasts are prepared employing the
procedure of, for example, Hinnen et al., PNAS USA
(1978) 75:1919-1933 or Stinchcomb et al., EP 0 045 573
A2. The transformants may then be grown in an
appropriate nutrient medium and, where appropriate, maintaining
selective pressure on the transformants. Where
expression is inducible, one can allow for growth of the
yeast to high density and then induce expression. In
those situations where a substantial proportion of the
product may be retained in the periplasmic space, one
can release the product by treating the yeast cells
with an enzyme such as zymolase or lyticase.

The product may be harvested by any
convenient means, purifying the protein by chromatography,
electrophoresis, dialysis, solvent-solvent extraction,
etc.

In accordance with the subject invention, one
can provide for secretion of a wide variety of
polypeptides, so as to greatly enhance product yield, simplify
purification, minimize degradation of the desired
product, and simplify processing, equipment, and
engineering requirements. Furthermore, utilization of
nutrients based on productivity can be greatly enhanced,
so that more economical and more efficient production
of polypeptides may be achieved. Also, the use of
yeast has many advantages in avoiding enterotoxins,
which may be present with prokaryotes, and employing
known techniques, which have been developed for yeast
over long periods of time, which techniques include
isolation of yeast products.

The following examples are offered by way of 
illustration and not by way of limitation. 

EXPERIMENTAL 
A synthetic sequence for human epidermal
growth factor (EGF) based on the amino acid sequence of
EGF reported by H. Gregory and B.M. Preston, Int. J.
Peptide Protein Res. 9, 107-118 (1977) was prepared,
which had the following sequence:

5' AACTCCGACTCCGAATGTCCATTGTCCCACGACGGTTACTGTTTGCACGACGCT
3' TTGAGGCTGAGGCTTACAGGTAACAGGGTGCTGCCAATG

ATGTACATCGAAGCTTTGGACAAGTACGCTTGTAACTGTGTTGTTGGTTA
TACATGTAGCTTCGAAACCTGTTCATGCGAACATT

AGATGTCAATACAGAGACTTGAAGTGGTGGGAATTGAGATGA
TCTACAGTTATGTCTCTGAACTTCACCACCCTTAACTCT

where 5' indicates the promoter proximal end of the
sequence. The sequence was inserted into the EcoRI
site of pBR328 to produce a plasmid p328EGF-1 and
cloned.
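
Portions of the printed coding strand can be checked against the EGF amino acid sequence by translation. The sketch below is illustrative only: it assumes the standard genetic code, uses just the first 27 bases of the first printed line and the whole of the last printed line, and assumes that each begins on a codon boundary. The helper name `translate` is not from the specification.

```python
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def translate(seq):
    """Translate whole codons from the first base; '*' marks a stop codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))

# Fragments of the coding strand as printed in the text (assumed in frame).
n_terminal_fragment = "AACTCCGACTCCGAATGTCCATTGTCC"
c_terminal_fragment = "AGATGTCAATACAGAGACTTGAAGTGGTGGGAATTGAGATGA"

n_peptide = translate(n_terminal_fragment)
c_peptide = translate(c_terminal_fragment)
print(n_peptide)
print(c_peptide)
```

Under these assumptions the first fragment begins Asn-Ser-Asp-Ser, and the last fragment ends in Glu-Leu-Arg followed by a stop codon.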

Approximately 30 μg of p328EGF-1 was digested
with EcoRI and approximately 1 μg of the expected 190
base pair EcoRI fragment was isolated. This was
followed by digestion with the restriction enzyme HgaI.
Two synthetic oligonucleotide connectors, HindIII-HgaI
and HgaI-SalI, were then ligated to the 159 base pair
HgaI fragment. The HgaI-HindIII linker had the following
sequence:

AGCTGAAGCT
    CTTCGATTGAG

This linker restores the α-factor processing signals
interrupted by the HindIII digestion and joins the HgaI
end at the 5'-end of the EGF gene to the HindIII end
of pAB112.

The HgaI-SalI linker had the following
sequence:

TGAGATGATAAG
     ACTATTCAGCT

This linker has two stop codons and joins the HgaI end
at the 3'-end of the EGF gene to the SalI end of
pAB112.
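
The two linkers can be checked for base pairing with a short script. This is a sketch only: it assumes that the upper strand of each linker is written 5' to 3' and the lower strand 3' to 5', and that the lower strand starts under the fifth base (HindIII-HgaI linker) or sixth base (HgaI-SalI linker) of the upper strand. These offsets are inferred from complementarity rather than stated in the text, and the function name `duplex_ok` is hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def duplex_ok(top, bottom, offset):
    """Check Watson-Crick pairing where the two strands overlap.

    top is read 5'->3', bottom 3'->5', and the first base of bottom
    sits under top[offset]. Unpaired overhangs are ignored.
    """
    return all(COMPLEMENT[t] == b for t, b in zip(top[offset:], bottom))

# (upper strand, lower strand, assumed offset of the lower strand)
linkers = {
    "HindIII-HgaI": ("AGCTGAAGCT", "CTTCGATTGAG", 4),
    "HgaI-SalI": ("TGAGATGATAAG", "ACTATTCAGCT", 5),
}
for name, (top, bottom, offset) in linkers.items():
    print(name, duplex_ok(top, bottom, offset))

# Codons read from the start of the paired region of the HgaI-SalI upper strand.
top = linkers["HgaI-SalI"][0]
stops = [top[i:i + 3] for i in (5, 8)]
print(stops)
```

With these offsets both duplex regions pair completely, and the second linker carries the codons TGA and TAA in tandem, consistent with the statement that it has two stop codons.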

The resulting 181 base pair fragment was
purified by preparative gel electrophoresis and ligated
to 100 ng of pAB112 which had been previously completely
digested with the enzymes HindIII and SalI. Surprisingly,
a deletion occurred where the codons for the 3rd and 4th
amino acids of EGF, asp and ser, were deleted, with the
remainder of the EGF being retained.

pAB112 is a plasmid containing a 1.75kb EcoRI
fragment with the yeast α-factor gene cloned in the
EcoRI site of pBR322 in which the HindIII and SalI
sites had been deleted (pAB11). pAB112 was derived from
plasmid pAB101 which contains the yeast α-factor gene
as a partial Sau3A fragment cloned in the BamHI site of
plasmid YEp24. pAB101 was obtained by screening a
yeast genomic library in YEp24 using a synthetic 20-mer
oligonucleotide probe (3'-GGCCGGrTGCnXACKLGKIT-5')
homologous to the published α-factor coding region
(Kurjan and Herskowitz, Abstracts 1981 Cold Spring
Harbor meeting on the Molecular Biology of Yeasts,
page 242).

The resulting mixture was used to transform
E. coli HB101 cells and plasmid pAB201 obtained.
Plasmid pAB201 (5 μg) was digested to completion with
the enzyme EcoRI and the resulting fragments were:
a) filled in with DNA polymerase I Klenow fragment;
b) ligated to an excess of BamHI linkers; and
c) digested with BamHI. The 1.75kbp EcoRI fragment was
isolated by preparative gel electrophoresis and
approximately 100 ng of the fragment was ligated to
100 ng of pC1/1, which had been previously digested to
completion with the restriction enzyme BamHI and
treated with alkaline phosphatase.

Plasmid pC1/1 is a derivative of pJDB219,
Beggs, Nature (1978) 275:104, in which the region
corresponding to bacterial plasmid pMB9 in pJDB219 has
been replaced by pBR322 in pC1/1. This mixture was
used to transform E. coli HB101 cells. Transformants
were selected by ampicillin resistance and their
plasmids analyzed by restriction endonucleases. DNA
from one selected clone (pYEGF-8) was prepared and used
to transform yeast AB103 cells. Transformants were
selected by their leu+ phenotype.

Fifty milliliter cultures of yeast strain
AB103 (α, pep4-3, leu2-3, leu2-112, ura3-52,
his4-580) transformed with plasmid pYEGF-8 (deposited at
the American Type Culture Collection on 5th January
1983 and given ATCC Accession no. 20658) were grown at
30° in -leu medium to saturation (optical density at
600nm of 5) and left shaking at 30° for an additional
12 hr period. Cell supernatants were collected by
centrifugation and analyzed for the presence of human
EGF using the fibroblast receptor competition binding
assay. The assay of EGF is based on the ability of
both mouse and human EGF to compete with 125I-labeled
mouse EGF for binding sites on human foreskin
fibroblasts. Standard curves can be obtained by measuring
the effects of increasing quantities of EGF on the
binding of a standard amount of 125I-labeled mouse EGF.
Under these conditions 2 to 20 ng of EGF are readily
measurable. Details on the binding of 125I-labeled
epidermal growth factor to human fibroblasts have been
described by Carpenter et al., J. Biol. Chem. 250, 4297
(1975). Using this assay it is found that the culture
medium contains 7±1 mg of human EGF per liter.

For further characterization, human EGF
present in the supernatant was purified by absorption
to the ion-exchange resin Biorex-70 and elution with
10 mM HCl in 80% ethanol. After evaporation of the HCl
and ethanol the EGF was solubilized in water. This
material migrates as a single major protein of MW
approx. 6,000 in 17.5% SDS gels, roughly the same as
authentic mouse EGF (MW~6,000). This indicates that
the α-factor leader sequence has been properly excised
during the secretion process. Analysis by high
resolution liquid chromatography (microbondapak C18, Waters
column) indicates that the product migrates with a
retention time similar to an authentic mouse EGF
standard. However, protein sequencing by Edman
degradation showed that the N-terminus retained the glu-ala
sequence.

A number of other constructions were prepared
using different constructions for joining hEGF to the
α-factor secretory leader sequence, providing for
different processing signals and site mutagenesis. In
Fig. 2, a. through e. show the sequence of the fusions at
the N-terminal region of hEGF, which sequences differ
among several constructions; f. shows the sequences at
the C-terminal region of hEGF, which is the same for all
constructions. Synthetic oligonucleotide linkers used
in these constructions are boxed.

These fusions were made as follows.
Construction (a) was made as described above. Construction (b)
was made in a similar way except that linker 2 was used
instead of linker 1. Linker 2 modifies the α-factor
processing signal by inserting an additional processing
site (ser-leu-asp-lys-arg) immediately preceding the
hEGF gene. The resulting yeast plasmid is named
pYαEGF-22. Construction (c), in which the dipeptidyl
aminopeptidase maturation site (glu-ala) has been removed,
was obtained by in vitro mutagenesis of construction
(a). A PstI-SalI fragment containing the α-factor
leader-hEGF fusion was cloned in phage M13 and isolated
in a single-stranded form. A synthetic 31-mer of
sequence 5'-TCTTTGGATAAAAGAAACTCCGACTCCCG-3' was
synthesized and 70 picomoles were used as a primer for
the synthesis of the second strand from 1 picomole of
the above template by the Klenow fragment of DNA
polymerase. After fill-in and ligation at 14° for 18
hrs, the mixture was treated with S1 nuclease (5 units
for 15 min) and used to transfect E. coli JM101 cells.
Bacteriophage containing DNA sequences in which the
region coding for (glu-ala) was removed were located by
filter plaque hybridization using the 32P-labeled
primer as probe. RF DNA from positive plaques was
isolated, digested with PstI and SalI and the resulting
fragment inserted in pAB114 which had been previously
digested to completion with SalI and partially with
PstI and treated with alkaline phosphatase.
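
The effect of the mutagenic primer can be illustrated by translating it. The sketch below assumes the standard genetic code and that the primer, taken exactly as printed above (29 bases), is read in frame from its first base; the helper name `translate` is illustrative.

```python
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def translate(seq):
    """Translate whole codons from the first base; ignore a trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))

# Primer for construction (c), as printed in the text.
primer = "TCTTTGGATAAAAGAAACTCCGACTCCCG"
peptide = translate(primer)
print(peptide)

# "EA" (glu-ala) should be absent between lys-arg and the start of hEGF.
print("EA" in peptide)
```

Read this way, the primer encodes ser-leu-asp-lys-arg followed directly by asn-ser-asp-ser, the first residues of hEGF, with no glu-ala between them.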

The plasmid pAB114 was derived as follows:
plasmid pAB112 was digested to completion with HindIII
and then religated at low (4 μg/ml) DNA concentration
and plasmid pAB113 was obtained in which three 63bp
HindIII fragments have been deleted from the α-factor
structural gene, leaving only a single copy of mature
α-factor coding region. A BamHI site was added to
plasmid pAB11 by cleavage with EcoRI, filling in of the
overhanging ends by the Klenow fragment of DNA
polymerase, ligation of BamHI linkers, cleavage with
BamHI and religation to obtain pAB12. Plasmid pAB113
was digested with EcoRI, the overhanging ends filled
in, and ligated to BamHI linkers. After digestion with
BamHI the 1500bp fragment was gel-purified and ligated
to pAB12 which had been digested with BamHI and treated
with alkaline phosphatase. Plasmid pAB114, which
contains a 1500bp BamHI fragment carrying the α-factor
gene, was obtained. The resulting plasmid (pAB114
containing the above described construct) is then
digested with BamHI and ligated into plasmid pC1/1.
The resulting yeast plasmid is named pYαEGF-23
and was deposited at the American Type Culture Collection
on 12th August 1983 under ATCC Accession no. 40079.

Construction (d), in which a new KpnI site was generated,
was made as described for construction (c) except that the
36-mer oligonucleotide primer of sequence
5'-GGGTACCTTTGGATAAAAGAAACTCCGACTCCGAAT-3' was used. The
resulting yeast plasmid is named pYαEGF-24. Construction
(e) was derived by digestion of the plasmid containing
construction (d) with KpnI and SalI instead of linker 1 and
2. The resulting yeast plasmid is named pYαEGF-25.
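
The 36-mer primer quoted for construction (d) can be inspected with a short script. The following is a minimal sketch and not part of the original disclosure: it locates the KpnI recognition sequence (GGTACC) in the primer and translates the primer with the standard genetic code. The reading frame is an assumption, chosen as the frame that places the AAA-AGA (lys-arg) codons in register; the names `CODON_TABLE`, `translate` and `frame_offset` are illustrative only.

```python
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate complete codons; a trailing partial codon is ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


PRIMER_36MER = "GGGTACCTTTGGATAAAAGAAACTCCGACTCCGAAT"
KPNI_SITE = "GGTACC"

# Position (0-based) of the KpnI recognition sequence within the primer.
kpni_position = PRIMER_36MER.find(KPNI_SITE)

# Assumed frame: the one in which AAA-AGA (lys-arg) is read as two codons.
frame_offset = PRIMER_36MER.find("AAAAGA") % 3
peptide = translate(PRIMER_36MER[frame_offset:])

print(len(PRIMER_36MER), kpni_position, frame_offset, peptide)
```

Under the assumed frame, the lys-arg codons are read immediately ahead of codons for asn-ser-asp-ser-glu, which is consistent with the text's description of a processing site directly preceding the hEGF coding sequence.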

Yeast cells transformed with pYαEGF-22 were grown in 15 ml
cultures. At the indicated densities or times, cultures
were centrifuged and the supernatants saved and kept on
ice. The cell pellets were washed in lysis buffer (0.1%
Triton X-100, 10mM NaHPO4 pH 7.5) and broken by vortexing
(5min in 1min intervals with cooling on ice in between) in
one volume of lysis buffer and one volume of glass beads.
After centrifugation, the supernatants were collected and
kept on ice. The amount of hEGF in the culture medium and
cell extracts was measured using the fibroblast receptor
binding competition assay. Standard curves were obtained by
measuring the effects of increasing quantities of mouse EGF
on the binding of a standard amount of 125I-labeled mouse
EGF.

Proteins were concentrated from the culture media by
absorption on Bio-Rex 70 resin and elution with 0.01 HCl in
80% ethanol and purified by high performance liquid
chromatography (HPLC) on a reverse phase C18 column. The
column was eluted at a flow rate of 4ml/min with a linear
gradient of 5% to 80% acetonitrile containing 0.2%
trifluoroacetic acid in 60min. Proteins (200-800 picomoles)
were sequenced at the amino-terminal end by the Edman
degradation method using a gas-phase protein sequencer,
Applied Biosystems model 470A. The normal PROTFA program
was used for all the analyses. Dithiothreitol was added to
S2 (ethyl acetate: 20mg/liter) and S3 (butyl chloride:
10mg/liter) immediately before use. All samples were
treated with 1N HCl in methanol at 40° for 15min to convert
PTH-aspartic acid and PTH-glutamic acid to their methyl
esters. All PTH-amino acid identifications were performed
by reference to retention times on an IBM CN HPLC column
using a known mixture of PTH-amino acids as standards.
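
The linear gradient described above (5% to 80% acetonitrile over 60min at 4ml/min) implies a fixed rate of change in solvent composition. The sketch below works through that arithmetic; the function name `acetonitrile_percent` and the clamping of times outside the run are illustrative assumptions, not part of the original method.

```python
def acetonitrile_percent(t_min: float, start: float = 5.0, end: float = 80.0,
                         duration_min: float = 60.0) -> float:
    """Percent acetonitrile at time t for a linear gradient.

    Times before 0 or after the end of the run are clamped to the run limits.
    """
    t = min(max(t_min, 0.0), duration_min)
    return start + (end - start) * t / duration_min


FLOW_ML_PER_MIN = 4.0

# Rate of change of solvent composition, in percent acetonitrile per minute.
slope = (80.0 - 5.0) / 60.0

# Total eluent volume delivered over the 60 min gradient.
total_volume_ml = FLOW_ML_PER_MIN * 60.0

for t in (0, 15, 30, 60):
    print(t, acetonitrile_percent(t))
```

With these figures the composition changes by 1.25 percentage points per minute and the gradient consumes 240 ml of eluent.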

Secretion from pYαEGF-22 gave a 4:1 mole ratio of native
N-terminus hEGF to glu-ala terminated hEGF, while secretion
from pYαEGF-23-25 gave only native N-terminated hEGF.
Yields of hEGF ranged from 5 to 8µg/ml measured either as
protein or in a receptor binding assay.

The strain JRY188 (MAT sir3-8 leu2-3 leu2-112 trp1 HES3
his4 rme) was transformed with pYαEGF-21 and leucine
prototrophs selected at 37°. Saturated cultures were then
diluted 1/100 in fresh medium and grown in leucine
selective medium at permissive (24°) and non-permissive
(36°) temperatures and culture supernatants were assayed
for the presence of hEGF as described above. The results
are shown in the following table.

Regulated synthesis and secretion of hEGF in transformed
yeast sir3 temperature-sensitive mutants.

Temperature   Transformant   OD650   hEGF

36°           3a             3.5        0.010
                             5.4        0.026
              3b             3.6        0.020
                             6.4        0.024

24°           3a             0.4       34
                             1.3      145
                             2.1     1075
                             4.0     3250
              3b             0.4       32
                             1.4      210
                             2.2     1935
                             4.2     4600


These results indicate that the hybrid α-factor/EGF gene is
being expressed under mating type regulation, even though
it is present on a high copy number plasmid.
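
The size of the temperature effect can be read off the table by comparing the highest hEGF value recorded for each transformant at 24° with the highest value at 36°. The sketch below does this; the row values are transcribed from the table, while the helper names (`peak_hegf`, `fold_induction`) and the choice of peak values as the basis of comparison are illustrative assumptions. The ratio is unit-free provided both temperatures were measured in the same units.

```python
# (temperature, transformant, OD650, hEGF) rows transcribed from the table.
ROWS = [
    ("36", "3a", 3.5, 0.010), ("36", "3a", 5.4, 0.026),
    ("36", "3b", 3.6, 0.020), ("36", "3b", 6.4, 0.024),
    ("24", "3a", 0.4, 34), ("24", "3a", 1.3, 145),
    ("24", "3a", 2.1, 1075), ("24", "3a", 4.0, 3250),
    ("24", "3b", 0.4, 32), ("24", "3b", 1.4, 210),
    ("24", "3b", 2.2, 1935), ("24", "3b", 4.2, 4600),
]


def peak_hegf(temperature: str, transformant: str) -> float:
    """Highest hEGF value recorded for one transformant at one temperature."""
    return max(
        hegf for temp, name, _od, hegf in ROWS
        if temp == temperature and name == transformant
    )


def fold_induction(transformant: str) -> float:
    """Ratio of peak hEGF at 24 degrees to peak hEGF at 36 degrees."""
    return peak_hegf("24", transformant) / peak_hegf("36", transformant)


for name in ("3a", "3b"):
    print(name, round(fold_induction(name)))
```

For both transformants the ratio is on the order of 10^5, which is the quantitative basis for describing expression as regulated by the sir3 temperature-sensitive mutation.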

In accordance with the subject invention, novel constructs
are provided which may be inserted into vectors to provide
for expression of polypeptides having an N-terminal leader
sequence and one or more processing signals to provide for
secretion of the polypeptide as well as processing to
result in a mature polypeptide product free of superfluous
amino acids. Thus, one can obtain a polypeptide having the
identical sequence to a naturally occurring polypeptide. In
addition, because the polypeptide can be produced in yeast,
glycosylation can occur, so that products can be obtained
which are identical to the naturally occurring products.
Furthermore, because the product is secreted, greatly
enhanced yields can be obtained based on cell population
and processing and purification are greatly simplified. In
addition, employing mutant hosts, expression can be
regulated to be turned off or on, as desired.

Although the foregoing invention has been described in some
detail by way of illustration and example for purposes of
clarity of understanding, it will be obvious that certain
changes and modifications may be practiced within the scope
of the appended claims.

CLAIMS



1. A DNA construct encoding a pre-pro-polypeptide, said DNA
construct encoding pre-pro-polypeptide comprising a yeast
leader sequence, processing signals for processing the
pre-pro-polypeptide to a mature polypeptide and a gene
encoding a polypeptide other than the wild type gene
associated with said leader sequence.

2. A DNA construct according to Claim 1, including at the
5' end of the sequence a yeast promoter and wherein said
gene is heterologous to said yeast host.

3. A DNA construct according to Claim 2, wherein said yeast
promoter is the α-factor promoter and said yeast leader is
a leader sequence encoding for at least a major portion of
the α-factor leader and is capable of providing for
secretion.

4. A DNA construct according to Claim 2, wherein said gene
is a mammalian gene.

5. A DNA construct comprising a sequence of the following
formula:

L-((R)_r-(GAXYCX)_n-Gene*)_y

wherein:

L is a leader sequence recognized by yeast for secretion;

R is a codon coding for arginine or lysine;

r is an integer of from 2 to 4;

X is any nucleotide;

Y is guanosine or cytosine;

y is an integer of from about 1 to 10;

Gene* is a gene foreign to yeast; and

n is 0 or 1 to 4.

6. A DNA construct according to Claim 5, wherein n is 0 to
4 and the nucleotides of said Gene* proximal to R at least
in part define a recognition site for a restriction
endonuclease.

7. A DNA construct according to Claim 6, wherein said
leader sequence is the α-factor leader sequence.

8. A DNA construct according to Claim 7, wherein n is 0.

9. A DNA construct of the formula:

Tr-L-(R-S-(GAXYCX)_n'-W-(Gene*)_d)_y

wherein:

Tr is a sequence having transcriptional and translational
regulatory signals for initiation and processing of
transcription and translation, wherein said regulatory
signals are recognized by yeast;

L is a leader sequence for secretion by yeast;

R and S are codons expressing arginine and lysine;

X is any nucleotide;

Y is cytosine or guanosine;

y is an integer of from 1 to 4;

n' is a whole number of from 0 to 4;

W is a deoxyribosyl-3' group or when n' is other than 0,
one or more nucleotides which by themselves or together
with the hexanucleotide in the parenthesis define a
restriction site;

Gene* either by itself or taken together with W defines a
polypeptide sequence foreign to yeast; and

d is 0 or 1, being 1, when y is greater than 1.

10. A DNA construct according to Claim 9, wherein Tr is a
sequence defining the regulatory signals for α-factor, d is
1 and Gene* and W are taken together to define a
polypeptide foreign to yeast.

11. A DNA construct according to Claim 9 wherein n' is 0.

12. A DNA construct according to Claim 11, wherein said
polypeptide product is a mammalian polypeptide.

13. A DNA construct comprising a sequence of the formula:

(Tr)_a-L-R-S-(GAXYCX)_n''-GA[AGCT]

wherein:

Tr is a sequence defining transcriptional and translational
regulatory signals for initiation and processing of
transcription and translation recognized by yeast;

a is 0 or 1;

L is a leader sequence recognized by yeast;

R and S are codons encoding for lysine and arginine;

X is any nucleotide;

Y is cytosine or guanosine;

n'' is 2 to 4;

the nucleotides in the broken box (shown here in square
brackets) indicate the nucleotides which are complementary
to the overhang of the non-coding chain to define a HindIII
restriction site.

14. A DNA construct according to Claim 13, where Tr is a
sequence defining the transcriptional and translational
regulatory signals of α-factor.

15. An expression episomal element comprising a replication
system for providing stable maintenance in yeast and a
sequence of the formula:

Tr-L-(R)_r'-(GAXYCX)_n'-W-Te

wherein:

Tr is a sequence defining transcriptional and translational
regulatory signals for initiation and processing of
transcription and translation in yeast;

L is a leader sequence recognized by yeast for secretion;

R is a codon defining arginine or lysine;

r' is a whole number in the range of 2 to 4;

X is any nucleotide;

Y is cytosine or guanosine;

n' is a whole number in the range of 0 to 4;

W is a nucleotide sequence of at least 1 nucleotide, which
by itself or when n' is other than 0, in conjunction with
nucleotides in the parenthesis defines a restriction site;

Te is a sequence defining a terminator balanced with said
transcriptional initiator sequence.

16. An expression episomal element according to Claim 15
wherein Tr is derived from α-factor and n' is 2 to 3.

17. An expression episomal element according to Claim 14,
wherein Tr is derived from α-factor and n' is 0.



18. An episomal expression vector according to Claim 17,
having a gene foreign to yeast intermediate R and Te and in
reading frame with the initiation codon of L.

19. An episomal expression element according to Claim 18,
wherein said gene is a mammalian gene.

20. An episomal element according to Claim 19, wherein said
mammalian gene is human epidermal growth factor.

21. An episomal expression vector according to Claim 16,
having a gene foreign to yeast intermediate the nucleotides
in the parentheses and Te and in reading frame with the
initiation codon of L.

22. A method for producing a polypeptide foreign to yeast
and having such polypeptide secreted into the culture
medium, said method comprising:

growing yeast containing an episomal expression element
according to Claim 16, whereby the encoding sequences are
expressed to produce a pre-pro-polypeptide; and

said pre-pro-polypeptide is at least partially processed
and secreted.

23. An episomal expression vector according to Claim 17,
having a gene foreign to yeast intermediate the nucleotides
in the parentheses and Te and in reading frame with the
initiation codon of L.

24. A method for producing a polypeptide foreign to yeast
and having such polypeptide secreted into the culture
medium, said method comprising:

growing yeast mutants containing an episomal expression
element according to Claim 16, wherein said mutant permits
external regulation of expression, whereby the encoding
sequences are expressed to produce a pre-pro-polypeptide
under permissive conditions; and

said pre-pro-polypeptide is at least partially processed
and secreted.

25. A method according to Claim 24, wherein said mutant
yeast is a temperature-sensitive sir mutant.

26. A method for producing a polypeptide foreign to yeast
and having such polypeptide secreted into the culture
medium, said method comprising:

growing yeast mutants containing an episomal expression
element according to Claim 17, wherein said mutant permits
external regulation of expression, whereby the encoding
sequences are expressed to produce a pre-pro-polypeptide
under permissive conditions; and

said pre-pro-polypeptide is at least partially processed
and secreted.

27. A method according to Claim 26, wherein said mutant
yeast is a temperature-sensitive sir mutant.

Description

BACKGROUND OF THE INVENTION
Field of the Invention

Hybrid DNA technology has revolutionized the ability to produce polypeptides of an infinite variety of
compositions. Since living forms are composed of proteins and employ proteins for regulation, the ability to
duplicate these proteins at will offers unique opportunities for investigating the manner in which these
proteins function and the use of such proteins, fragments of such proteins, or analogs in therapy and
diagnosis.

There have been numerous advances in improving the rate and amount of protein produced by a cell.
Most of these advances have been associated with higher copy numbers, more efficient promoters, and
means for reducing the amount of degradation of the desired product. It is evident that it would be
extremely desirable to be able to secrete polypeptides of interest, where such polypeptides are the product
of interest.

Furthermore, in many situations, the polypeptide of interest does not have an initial methionine amino
acid. This is usually a result of there being a processing signal in the gene encoding for the polypeptide of
interest, which the gene source recognizes and cleaves with an appropriate peptidase. Since in most
situations, genes of interest are heterologous to the host in which the gene is to be expressed, such
processing occurs imprecisely and in low yield in the expression host. In this case, while the protein which
is obtained will be identical to the peptide of interest for almost all of its sequence, it will differ at the
N-terminus, which can deleteriously affect physiological activity.

There are, therefore, many reasons why it would be extremely advantageous to prepare DNA
sequences which would encode for the secretion and maturing of the polypeptide product. Furthermore,
where sequences can be found for processing, which result in the removal of amino acids superfluous to
the polypeptide of interest, the opportunity exists for having a plurality of DNA sequences, either the same
or different, in tandem, which may be encoded on a single transcript.

Description of the Prior Art

U.S. Patent No. 4,336,336 describes for prokaryotes the use of a leader sequence coding for a
non-cytoplasmic protein normally transported to or beyond the cell surface, resulting in transfer of the fused
protein to the periplasmic space. U.S. Patent No. 4,338,397 describes for prokaryotes using a leader
sequence which provides for secretion with cleavage of the leader sequence from the polypeptide
sequence of interest. U.S. Patent No. 4,338,397, columns 3 and 4, provides useful definitions, which
definitions are incorporated herein by reference.

Kurjan and Herskowitz, Cell (1982) 30:933-943 describes a putative α-factor precursor containing four
tandem copies of mature α-factor, describing the sequence and postulating a processing mechanism.
Kurjan and Herskowitz, Abstracts of Papers presented at the 1981 Cold Spring Harbor meeting on The
Molecular Biology of Yeasts, page 242, in an Abstract entitled, "A Putative α-Factor Precursor Containing
Four Tandem Repeats of Mature α-Factor," describe the sequence encoding for the α-factor and spacers
between two of such sequences. Blair et al., Abstracts of Papers, ibid, page 243, in an Abstract entitled
"Synthesis and Processing of Yeast Pheromones: Identification and Characterization of Mutants That
Produce Altered α-Factors," describe the effect of various mutants on the production of mature α-factor.

SUMMARY OF THE INVENTION

The subject matter of the invention is defined in the claims.

Methods and compositions are provided for producing mature polypeptides. DNA constructs are
provided which join the DNA fragments encoding for a yeast leader sequence and processing signal to
heterologous genes for secretion and maturation of the polypeptide product. The construct of the DNA
encoding for the N-terminal cleavable alpha-factor leader (or fragments or analogs thereof) and the DNA
sequence encoding for the mature polypeptide product can be joined to appropriate vectors for introduction
into yeast or other cell which recognizes the processing signals for production of the desired polypeptide.
Other capabilities may also be introduced into the construct for various purposes.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 is a flow diagram indicating the construction of pYαEGF-21.

Fig. 2 shows sequences at fusions of hEGF to the vector; a. through e. show the sequences at the
N-terminal region of hEGF, which differ among several constructions, and f. shows the C-terminal region of
hEGF.

DESCRIPTION OF THE SPECIFIC EMBODIMENTS

In accordance with the subject invention, eukaryotic hosts, particularly yeast, are employed for the
production of mature polypeptides, where such polypeptides may be harvested from a nutrient medium.
The polypeptides are produced by employing a DNA construct encoding for an alpha-factor leader (or
fragments or analogs thereof) and processing signals joined to a polypeptide of interest, which may be a
single polypeptide or a plurality of polypeptides separated by processing signals. The resulting construct
encodes for a pre-pro-polypeptide which will contain the signals for secretion of the pre-pro-polypeptide and
processing of the polypeptide, either intracellularly or extracellularly, to the mature polypeptide.

According to a preferred embodiment of the invention, there is provided a DNA construct comprising a
sequence comprising the formula:

5'-Tr-L-Sp-Gene*-Te-3'

wherein:

Tr is a yeast promoter sequence;

L encodes at least a yeast alpha-factor leader sequence fragment that provides for secretion;

Sp is a spacer sequence encoding processing signals for processing the precursor polypeptide
encoded by L-Sp-Gene* into the polypeptide encoded by Gene*;

Gene* encodes a polypeptide foreign to yeast; and

Te is a transcription termination sequence balanced with Tr.

Furthermore, a DNA construct of the above formula is preferred wherein Sp contains or is composed of the
sequence 5'-R1-R2-3' immediately adjacent to the sequence Gene*, R1 being a codon for lysine or arginine
and R2 being a codon for arginine, but which does not encode a processing signal for
dipeptidylaminopeptidase A.

The constructs of the subject invention will for example have at least the following formula defining a
pro-polypeptide:

((R)_r-(GAXYCX)_n-Gene*)_y

wherein:

R is CGX or AZZ, the codons coding for lysine and arginine, each of the Rs being the same or
different;

r is an integer of from 2 to 4, usually 2 to 3, preferably 2 or 4;

X is any of the four nucleotides, T, G, C, or A;

Y is G or C;

y is an integer of at least one and usually not more than 10, more usually not more than four, providing
for monomers and multimers;

Z is A or G; and

Gene* is a gene other than α-factor, usually foreign to a yeast host, usually a heterologous gene,
desirably a plant or mammalian gene;

n is 0 or an integer which will generally vary from 1 to 4, usually 2 to 3.

The pro-polypeptide has an N-terminal processing signal for peptidase removal of the amino acids
preceding the amino acids encoded for by Gene*.
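
The degenerate symbols in the formula can be made concrete by enumerating every sequence they stand for. The sketch below is offered only as an illustration: it expands CGX, AZZ and GAXYCX using the definitions of X, Y and Z given above and translates the results with the standard genetic code. The names `DEGENERATE`, `expand` and `translate` are hypothetical helpers, not terms from the specification.

```python
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Degenerate symbols as defined in the text:
# X is any nucleotide, Y is G or C, Z is A or G.
DEGENERATE = {"X": "TCAG", "Y": "GC", "Z": "AG"}


def expand(pattern: str) -> list[str]:
    """All concrete sequences matching a pattern such as 'CGX' or 'AZZ'."""
    choices = [DEGENERATE.get(symbol, symbol) for symbol in pattern]
    return ["".join(seq) for seq in product(*choices)]


def translate(dna: str) -> str:
    """Translate a sequence whose length is a multiple of three."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Amino acids encoded by the R codons (one-letter code: K = lys, R = arg).
r_residues = {translate(codon) for p in ("CGX", "AZZ") for codon in expand(p)}

# Dipeptides encoded by the hexanucleotide GAXYCX.
dipeptides = {translate(hexamer) for hexamer in expand("GAXYCX")}

print(sorted(r_residues), sorted(dipeptides))
```

The enumeration confirms that CGX and AZZ together encode only lysine and arginine, and that GAXYCX encodes an acidic residue (asp or glu) followed by ala or pro.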

For the most part, the constructs of the subject invention will have at least the following formula:

L-(R-S-(GAXYCX)_n-Gene*)_y

defining a pre-pro-polypeptide, wherein all the symbols except L and S have been defined, S having
the same definition as R, there being 1 R and 1 S, and L is an alpha-factor leader sequence (or fragment or
analog thereof) providing for secretion of the pre-pro-polypeptide. While it is feasible to have more Rs and
Ss, there will usually be no advantage in the additional amino acids. Any alpha-factor leader sequence (or
fragment or analog thereof) may be employed which provides for secretion, leader sequences generally
being of about 30 to 120 amino acids, usually about 30 to 100 amino acids, having a hydrophobic region
and having a methionine at its N-terminus.

The construct when n is 0 will have the following formula:

L-((R)_r'-Gene*)_y

defining a pre-pro-polypeptide, wherein all the symbols have been defined previously, except r', wherein:
r' is 2 to 4, preferably 2 or 4.

Of particular interest is the leader sequence of α-factor which is described in Kurjan and Herskowitz,
supra, on page 937, or fragments or analogs thereof, which provide for efficient secretion of the desired
polypeptides. Furthermore, the DNA sequence indicated in the article, which sequence is incorporated
herein by reference, is not essential, any sequence which encodes for the desired oligopeptide being
sufficient. Different sequences will be more or less efficiently translated.

While the above formulas are preferred, it should be understood that, with suppressor mutants, other
sequences could be provided which would result in the desired function. Normally, suppressor mutants are
not as efficient for expression and, therefore, the above indicated sequence or equivalent sequence
encoding for the same amino acid sequence is preferred. To the extent that a mutant will express from a
different codon the same amino acids which are expressed by the above sequence, then such alternative
sequence could be permitted.

The dipeptides which are encoded for by the sequence in the parenthesis will be an acidic amino acid,
aspartic or glutamic, preferably glutamic, followed by a neutral amino acid, alanine or proline, particularly
alanine.

In providing for useful DNA sequences which can be used for cassettes for expression, the following
sequence can be conveniently employed:

Tr-L-((R-S)_r''-(GAXYCX)_n'-W-(Gene*)_d)_y

wherein:

Tr intends a DNA sequence encoding for the transcriptional regulatory signals, particularly the promoter
and such other regulatory signals as operators, activators, cap signal, signals enhancing ribosomal binding,
or other sequence involved with transcriptional or translational control. The Tr sequence will generally be at
least about 100bp and not more than about 2000bp. Particularly useful is employing the Tr sequence
associated with the leader sequence L, so that a DNA fragment can be employed which includes the
transcriptional and translational signal sequences associated with the leader sequence endogenous to the
host. Alternatively, one may employ other transcriptional and translational signals to provide for enhanced
production of the expression product;

d is 0 or 1, being 1 when y is greater than 1;

n' is a whole number, generally ranging from 0 to 3, more usually being 0 or 2 to 3;

r'' is 1 or 2;

W intends a terminal deoxyribosyl-3' group, or a DNA sequence which by itself or, when n' is other
than 0, in combination with the nucleotides to which it is joined, defines a restriction site, having either a
cohesive end or butt end, wherein W may have from 0 to about 20 nucleotides in the longest chain;

the remaining symbols having been defined previously.
Of particular interest is the following construct:

(Tr)_a-L-(R-S)_r''-(GAXYCX)_n''-GA[AGCT]

wherein:

all of the symbols previously defined have the same definition;

a is 0 or 1, intending that the construct may or may not have the transcriptional and translational signals;

the nucleotides indicated in the broken box (shown here in square brackets) are intended not to be
present but to be capable of addition by adding a HindIII cleaved terminus to provide for the recreation of
the sequence encoding for a dipeptide; and

n'' will be 0 to 2, where at least one of the Xs and Ys defines a nucleotide, so that the sequence in the
parenthesis is other than the sequence GAAGCT.
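
How a HindIII-cleaved terminus recreates the dipeptide codons can be shown in a few lines. The sketch below assumes the usual HindIII cleavage pattern (A^AGCTT, leaving a four-base 5' overhang) and considers only the coding strand; `five_prime_overhang` and the two-entry codon lookup are illustrative helpers rather than anything defined in the specification.

```python
HINDIII_SITE = "AAGCTT"
HINDIII_CUT_INDEX = 1  # HindIII cleaves after the first A: A^AGCTT

# Only the two codons needed for this illustration.
CODONS = {"GAA": "glu", "GCT": "ala"}


def five_prime_overhang(site: str, cut_index: int) -> str:
    """Single-stranded 5' overhang left by a symmetric staggered cut."""
    return site[cut_index:len(site) - cut_index]


# Coding strand of the construct ends in GA, as in the formula above.
construct_end = "GA"

# Bases supplied by the HindIII-cleaved terminus of the incoming fragment.
overhang = five_prime_overhang(HINDIII_SITE, HINDIII_CUT_INDEX)

restored = construct_end + overhang
dipeptide = [CODONS[restored[i:i + 3]] for i in range(0, len(restored), 3)]

print(overhang, restored, dipeptide)
```

Joining the terminal GA to the AGCT overhang gives GAAGCT, that is, the codons GAA and GCT for glu-ala, which is the dipeptide of the processing signal.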

The coding sequence of Gene* may be joined to the terminal T, providing that the coding sequence is
in frame with the initiation codon and upon processing the first amino acid will be the correct amino acid for
the mature polypeptide.

The 3'-terminus of Gene* can be manipulated much more easily and, therefore, it is desirable to provide
a construct which allows for insertion of Gene* into a unique restriction site in the construct. Such a
construct would provide for a restriction site with insertion of the Gene* into the restriction site to be in
frame with the initiation codon. Such a construction can be symbolized as follows:

(Tr)_a-L-(R-S)_r''-(GAXYCX)_n''-W-(SC)_b-Te

wherein:

those symbols previously defined have the same definition;

SC are stop codons;

Te is a termination sequence balanced with the promoter Tr, and may include other signals, e.g.
polyadenylation; and

b is an integer which will generally vary from about 0 to 4, more usually from 0 to 3, it being
understood that Gene* may include its own stop codons.

Illustrative of a sequence having the above formula is where W is the sequence GA and n'' is 2.

Of particular interest is where the sequence encoding the terminal dipeptide is taken together with W to
define a linker or connector, which allows for recreation of the terminal sequence defining the dipeptide of
the processing signal and encodes for the initial amino acids of Gene*, so that the codons are in frame with
the initiation codon of the leader. The linker provides for a staggered or butt ended termination, desirably
defining a restriction site in conjunction with the successive sequences of the Gene*. Upon ligation of the
linker with Gene*, the codons of Gene* will be in frame with the initiation codon of the leader. In this manner,
one can employ a synthetic sequence which may be joined to a restriction site in the processing signal
sequence to recreate the processing signal, while providing the initial bases of the Gene* encoding for the
N-terminal amino acids. By employing a synthetic sequence, the synthetic linker can be a tailored
connector having a convenient restriction site near the 3'-terminus and the synthetic connector will then
provide for the necessary codons for the 5'-terminus of the gene.

Alternatively, one could introduce a restriction endonuclease recognition site downstream from the
processing signal to allow for cleavage and removal of superfluous bases to provide for ligation of the Gene*
to the processing signal in frame with the initiation codon. Thus the first codon would encode for the
N-terminal amino acid of the polypeptide. Where T is the first base of Gene*, one could introduce a restriction
site where the recognition sequence is downstream from the cleavage site. For example, a Sau3A
recognition sequence could be introduced immediately after the processing signal, which would allow for
cleavage and linking of the Gene* with its initial codon in frame with the leader initiation codon. With
restriction endonucleases which have the recognition sequence distal and downstream from the cleavage
site, e.g. HgaI, W could define such sequence which could include a portion of the processing signal
sequences. Other constructions can also be employed, employing such techniques as primer repair and in
vitro mutagenesis to provide for the convenient insertion of Gene* into the construct by introducing an
appropriate restriction site.

The construct provides a portable sequence for insertion into vectors, which provide the desired
replication system. As already indicated, in some instances, it may be desirable to replace the wild type
promoter associated with the leader sequence with a different promoter. In yeast, promoters involved with
enzymes in the glycolytic pathway can provide for high rates of transcription. These promoters are
associated with such enzymes as phosphoglucoisomerase, phosphofructokinase, phosphotriose isomerase,
phosphoglucomutase, enolase, pyruvic kinase, glyceraldehyde-3-phosphate dehydrogenase, and alcohol
dehydrogenase. These promoters may be inserted upstream from the leader sequence. The 5'-flanking
region to the leader sequence may be retained or replaced with the 3'-sequence of the alternative promoter.
Vectors can be prepared and have been reported which include promoters having convenient restriction
sites downstream from the promoter for insertion of such constructs as described above.

The final construct will be an episomal element capable of stable maintenance in a host, particularly a
fungal host such as yeast. The construct will include one or more replication systems, desirably two
replication systems, allowing for maintenance in the expression host and cloning in a prokaryote. In
addition, one or more markers for selection will be included, which will allow for selective pressure for
maintenance of the episomal element in the host. Furthermore, the episomal element may be a high or low
copy number, the copy number generally ranging from about 1 to 200. With high copy number episomal
elements, there will generally be at least 10, preferably at least 20, and usually not exceeding about 150,
more usually not exceeding about 100 copy number. Depending upon the Gene*, either high or low copy
numbers may be desirable, depending upon the effect of the episomal element on the host. Where the
presence of the expression product of the episomal element may have a deleterious effect on the viability
of the host, a low copy number may be indicated.

Various hosts may be employed, particularly mutants having desired properties. It should be
appreciated that depending upon the rate of production of the expression product of the construct, the processing
enzyme may or may not be adequate for processing at that level of production. Therefore, a mutant having
enhanced production of the processing enzyme may be indicated or enhanced production of the enzyme
may be provided by means of an episomal element. Generally, the production of the enzyme should be of
a lower order than the production of the desired expression product.

Where one is using α-factor for secretion and processing, it would be appropriate to provide for
enhanced production of the processing enzyme Dipeptidyl Amino Peptidase A, which appears to be the
expression product of STE13. This enzyme appears to be specific for X-Ala- and X-Pro-sequences, where X
in this instance intends an amino acid, particularly, the dicarboxylic acid amino acids.

Alternatively, there may be situations where intracellular processing is not desired. In this situation, it 
would be useful to have a ste13 mutant, where secretion occurs, but the product is not processed. In this 
manner, the product may be subsequently processed in vitro.

Host mutants which provide for controlled regulation of expression may be employed to advantage. For
example, with the constructions of the subject invention where a fused protein is expressed, the
transformants have slow growth which appears to be a result of toxicity of the fused protein. Thus, by
inhibiting expression during growth, the host may be grown to high density before changing the conditions to
permissive conditions for expression.

A temperature-sensitive sir mutant may be employed to achieve regulated expression. Mutation in any
of the SIR genes results in a non-mating phenotype due to in situ expression of the normally silent MATa
and MATα sequences present at the HML and HMR loci.

Furthermore, as already indicated, the Gene* may have a plurality of sequences in tandem, either the
same or different sequences, with intervening processing signals. In this manner, the product may be
processed in whole or in part, with the result that one will obtain the various sequences either by
themselves or in tandem for subsequent processing. In many situations, it may be desirable to provide for
different sequences, where each of the sequences is a subunit of a particular protein product.

The Gene* may encode for any type of polypeptide of interest. The polypeptide may be as small as an
oligopeptide of 8 amino acids or may be 100,000 daltons or higher. Usually, single chains will be less than
about 300,000 daltons, more usually less than about 150,000 daltons. Of particular interest are polypeptides
of from about 5,000 to 150,000 daltons, more particularly of about 5,000 to 100,000 daltons. Illustrative
polypeptides of interest include hormones and factors, such as growth hormone, somatomedins, epidermal
growth factor, the endocrine secretions, such as luteinizing hormone, thyroid stimulating hormone, oxytocin,
insulin, vasopressin, renin, calcitonin, follicle stimulating hormone, prolactin, etc.; hematopoietic factors, e.g.
erythropoietin, colony stimulating factor, etc.; lymphokines; globins; globulins, e.g. immunoglobulins;
albumins; interferons, such as α, β and γ; repressors; enzymes; endorphins, e.g. β-endorphin, enkephalin,
dynorphin, etc.

Having prepared the episomal elements containing the constructs of this invention, one may then
introduce such element into an appropriate host. The manner of introduction is conventional, there being a
wide variety of ways to introduce DNA into a host. Conveniently, spheroplasts are prepared employing the
procedure of, for example, Hinnen et al., PNAS USA (1978) 75:1919-1933 or Stinchcomb et al., EP 0 045
573 A2. The transformants may then be grown in an appropriate nutrient medium, where appropriate
maintaining selective pressure on the transformants. Where expression is inducible, one can allow for
growth of the yeast to high density and then induce expression. In those situations where a substantial
proportion of the product may be retained in the periplasmic space, one can release the product by treating
the yeast cells with an enzyme such as zymolase or lyticase.

The product may be harvested by any convenient means, purifying the protein by chromatography, 
electrophoresis, dialysis, solvent-solvent extraction, etc. 

In accordance with the subject invention, one can provide for secretion of a wide variety of
polypeptides, so as to greatly enhance product yield, simplify purification, minimize degradation of the desired
product, and simplify processing, equipment, and engineering requirements. Furthermore, utilization of
nutrients based on productivity can be greatly enhanced, so that more economical and more efficient
production of polypeptides may be achieved. Also, the use of yeast has many advantages in avoiding
enterotoxins, which may be present with prokaryotes, and employing known techniques, which have been
developed for yeast over long periods of time, which techniques include isolation of yeast products.

The following examples are offered by way of illustration and not by way of limitation.

EXPERIMENTAL

A synthetic sequence for human epidermal growth factor (EGF) based on the amino acid sequence of
EGF reported by H. Gregory and B.M. Preston, Int. J. Peptide Protein Res. 9, 107-118 (1977), was prepared,
which had the following sequence:

5' AACTCCGACTCCGAATGTCCATTGTCCCACGACGGTTACTGTTTGCACGACGGTGTTTGT
3' TTGAGGCTGAGGCTTACAGGTAACAGGGTGCTGCCAATGACAAACGTGCTGCCACAAACA

   ATGTACATCGAAGCTTTGGACAAGTACGCTTGTAACTGTGTTGTTGGTTACATCGGTGAA
   TACATGTAGCTTCGAAACCTGTTCATGCGAACATTGACACAACAACCAATGTAGCCACTT

   AGATGTCAATACAGAGACTTGAAGTGGTGGGAATTGAGATGA
   TCTACAGTTATGTCTCTGAACTTCACCACCCTTAACTCTACT ,

where 5' indicates the promoter proximal end of the sequence. The sequence was inserted into the EcoRI
site of pBR328 to produce a plasmid p328EGF-1 and cloned.
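
The synthetic sequence above can be checked for internal consistency by translating the upper strand and deriving its complement. The following is a minimal sketch only, not part of the original disclosure: it assumes the upper strand is the coding strand read from its first base, uses the standard genetic code, and transcribes the strand with positions that are illegible in the printed listing restored from the complementary strand. The names `TOP_STRAND`, `complement` and `translate` are illustrative.

```python
# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Upper strand, 5' -> 3', in the same three rows as the listing
# (assumed transcription; unclear bases restored by complementarity).
TOP_STRAND = (
    "AACTCCGACTCCGAATGTCCATTGTCCCACGACGGTTACTGTTTGCACGACGGTGTTTGT"
    "ATGTACATCGAAGCTTTGGACAAGTACGCTTGTAACTGTGTTGTTGGTTACATCGGTGAA"
    "AGATGTCAATACAGAGACTTGAAGTGGTGGGAATTGAGATGA"
)


def complement(seq):
    """Base-by-base complement, written in the same left-to-right order."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))


def translate(seq):
    """Translate complete codons from the first base; '*' marks a stop."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


protein = translate(TOP_STRAND)
bottom = complement(TOP_STRAND)
print(len(TOP_STRAND), "bases")
print(protein)
print(bottom[:60])
```

Under these assumptions the translation reads as a 53-residue polypeptide beginning Asn-Ser-Asp-Ser-Glu and ending at a TGA stop codon, and the derived complement can be compared row by row with the lower strand of the listing.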

Approximately 30 µg of p328EGF-1 was digested with EcoRI and approximately 1 µg of the expected
190 base pair EcoRI fragment was isolated. This was followed by digestion with the restriction enzyme
HgaI. Two synthetic oligonucleotide connectors, HindIII-HgaI and HgaI-SalI, were then ligated to the 159 base
pair HgaI fragment. The HgaI-HindIII linker had the following sequence:

AGCTGAAGCT
    CTTCGATTGAG

This linker restores the α-factor processing signals interrupted by the HindIII digestion and joins the HgaI
end at the 5'-end of the EGF gene to the HindIII end of pAB112.

The HgaI-SalI linker had the following sequence:

TGAGATGATAAG
     ACTATTCAGCT

This linker has two stop codons and joins the HgaI end at the 3'-end of the EGF gene to the SalI end of
pAB112.
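
The design of the two linkers can be checked with a few lines of code. This is a hedged sketch rather than anything taken from the original: it assumes each linker is printed with the upper strand 5' to 3' and the lower strand 3' to 5', that the paired region is the one giving perfect complementarity, and that the HgaI-SalI linker is read in frame with the end of the EGF coding sequence starting at its third base. All variable and function names are illustrative.

```python
def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Linker strands as printed (upper 5'->3', lower 3'->5'; orientation assumed).
HIND_TOP, HIND_BOTTOM_3TO5 = "AGCTGAAGCT", "CTTCGATTGAG"
SAL_TOP, SAL_BOTTOM_3TO5 = "TGAGATGATAAG", "ACTATTCAGCT"

STOPS = {"TAA", "TAG", "TGA"}


def stop_codons(seq, offset):
    """Stop codons found when reading seq in the frame starting at offset."""
    codons = [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]
    return [c for c in codons if c in STOPS]


# Rewrite the lower strands 5'->3' for comparison.
hind_bottom = HIND_BOTTOM_3TO5[::-1]
sal_bottom = SAL_BOTTOM_3TO5[::-1]

# HindIII side: upper strand carries a 5'-AGCT overhang; the lower-strand
# overhang pairs with the first five bases of the EGF sequence.
print("HindIII overhang:", HIND_TOP[:4])
print("pairs with EGF 5' end:", revcomp(hind_bottom[:5]))
print("duplex region matches:", hind_bottom[5:] == revcomp(HIND_TOP)[:6])

# SalI side: lower strand carries a 5'-TCGA overhang; the upper strand
# supplies the stop codons in the assumed reading frame.
print("SalI overhang:", sal_bottom[:4])
print("duplex region matches:", sal_bottom[4:] == revcomp(SAL_TOP)[:7])
print("in-frame stops:", stop_codons(SAL_TOP, 2))
```

With the stated assumptions, the upper strand of the HgaI-SalI linker reads AGA TGA TAA in frame, which is consistent with the statement that this linker has two stop codons.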

The resulting 181 base pair fragment was purified by preparative gel electrophoresis and ligated to
100 ng of pAB112 which had been previously completely digested with the enzymes HindIII and SalI.
Surprisingly, a deletion occurred where the codons for the 3rd and 4th amino acids of EGF, asp and ser,
were deleted, with the remainder of the EGF being retained.

pAB112 is a plasmid containing a 1.75 kb EcoRI fragment with the yeast α-factor gene cloned in the
EcoRI site of pBR322 in which the HindIII and SalI sites had been deleted (pAB11). pAB112 was derived
from plasmid pAB101 which contains the yeast α-factor gene as a partial Sau3A fragment cloned in the
BamHI site of plasmid YEp24. pAB101 was obtained by screening a yeast genomic library in YEp24 using a
synthetic 20-mer oligonucleotide probe (5'-GGCCGGTTGGTTACATGATT-3') homologous to the published
α-factor coding region (Kurjan and Herskowitz, Abstracts 1981 Cold Spring Harbor meeting on the Molecular
Biology of Yeasts, page 242).

The resulting mixture was used to transform E. coli HB101 cells and plasmid pAB201 obtained. Plasmid
pAB201 (5 µg) was digested to completion with the enzyme EcoRI and the resulting fragments were: a)
filled in with DNA polymerase I Klenow fragment; b) ligated to an excess of BamHI linkers; and c) digested
with BamHI. The 1.75 kbp EcoRI fragment was isolated by preparative gel electrophoresis and
approximately 100 ng of the fragment was ligated to 100 ng of pC1/1, which had been previously digested to
completion with the restriction enzyme BamHI and treated with alkaline phosphatase.

Plasmid pC1/1 is a derivative of pJDB219, Beggs, Nature (1978) 275:104, in which the region
corresponding to bacterial plasmid pMB9 in pJDB219 has been replaced by pBR322 in pC1/1. This mixture
was used to transform E. coli HB101 cells. Transformants were selected by ampicillin resistance and their
plasmids analyzed by restriction endonucleases. DNA from one selected clone (pYEGF-8) was prepared
and used to transform yeast AB103 cells. Transformants were selected by their leu+ phenotype.

Fifty milliliter cultures of yeast strain AB103 (α, pep4-3, leu2-3, leu2-112, ura3-52, his4-580)
transformed with plasmid pYEGF-8 (deposited at the American Type Culture Collection on 5th January 1983
and given ATCC Accession no. 20658) were grown at 30° in -leu medium to saturation (optical density at
600 nm of 5) and left shaking at 30° for an additional 12 hr period. Cell supernatants were collected by
centrifugation and analyzed for the presence of human EGF using the fibroblast receptor competition
binding assay. The assay of EGF is based on the ability of both mouse and human EGF to compete with
125I-labeled mouse EGF for binding sites on human foreskin fibroblasts. Standard curves can be obtained
by measuring the effects of increasing quantities of EGF on the binding of a standard amount of 125I-labeled
mouse EGF. Under these conditions 2 to 20 ng of EGF are readily measurable. Details on the binding of
125I-labeled epidermal growth factor to human fibroblasts have been described by Carpenter et al., J. Biol.
Chem. 250, 4297 (1975). Using this assay it is found that the culture medium contains 7±1 mg of human
EGF per liter.

For further characterization, human EGF present in the supernatant was purified by adsorption to the
ion-exchange resin Biorex-70 and elution with 10 mM HCl in 80% ethanol. After evaporation of the HCl and
ethanol the EGF was solubilized in water. This material migrates as a single major protein of MW approx.
6,000 in 17.5% SDS gels, roughly the same as authentic mouse EGF (MW ~6,000). This indicates that the
α-factor leader sequence has been properly excised during the secretion process. Analysis by high
resolution liquid chromatography (µBondapak C18, Waters column) indicates that the product migrates
with a retention time similar to an authentic mouse EGF standard. However, protein sequencing by Edman
degradation showed that the N-terminus retained the glu-ala sequence.

A number of other constructions were prepared using different ways of joining hEGF to the α-factor
secretory leader sequence, providing for different processing signals and site mutagenesis. In Fig. 2,
a. through e. show the sequences of the fusions at the N-terminal region of hEGF, which sequences differ
among the several constructions; f. shows the sequence at the C-terminal region of hEGF, which is the same
for all constructions. Synthetic oligonucleotide linkers used in these constructions are boxed.

These fusions were made as follows. Construction (a) was made as described above. Construction (b)
was made in a similar way except that linker 2 was used instead of linker 1. Linker 2 modifies the α-factor
processing signal by inserting an additional processing site (ser-leu-asp-lys-arg) immediately preceding the
hEGF gene. The resulting yeast plasmid is named pYαEGF-22. Construction (c), in which the dipeptidyl
aminopeptidase maturation site (glu-ala) has been removed, was obtained by in vitro mutagenesis of
construction (a). A PstI-SalI fragment containing the α-factor leader-hEGF fusion was cloned in phage M13
and isolated in a single-stranded form. A synthetic 31-mer of sequence
5'-TCTTTGGATAAAAGAAACTCCGACTCCCG-3' was synthesized and 70 picomoles were used as a primer
for the synthesis of the second strand from 1 picomole of the above template by the Klenow fragment of
DNA polymerase. After fill-in and ligation at 14° for 18 hrs, the mixture was treated with S1 nuclease (5
units for 15 min) and used to transfect E. coli JM101 cells. Bacteriophage containing DNA sequences in
which the region coding for (glu-ala) was removed were located by filter plaque hybridization using the
32P-labeled primer as probe. RF DNA from positive plaques was isolated, digested with PstI and SalI and the
resulting fragment inserted in pAB114 which had been previously digested to completion with SalI and
partially with PstI and treated with alkaline phosphatase.

The plasmid pAB114 was derived as follows:
plasmid pAB112 was digested to completion with HindIII and then religated at low (4 µg/ml) DNA
concentration and plasmid pAB113 was obtained in which three 63 bp HindIII fragments have been deleted
from the α-factor structural gene, leaving only a single copy of mature α-factor coding region. A BamHI site
was added to plasmid pAB11 by cleavage with EcoRI, filling in of the overhanging ends by the Klenow
fragment of DNA polymerase, ligation of BamHI linkers, cleavage with BamHI and religation to obtain
pAB12. Plasmid pAB113 was digested with EcoRI, the overhanging ends filled in, and ligated to BamHI
linkers. After digestion with BamHI the 1500 bp fragment was gel-purified and ligated to pAB12 which had
been digested with BamHI and treated with alkaline phosphatase. Plasmid pAB114, which contains a
1500 bp BamHI fragment carrying the α-factor gene, was obtained. The resulting plasmid (pAB114
containing the above described construct) is then digested with BamHI and ligated into plasmid pC1/1.

The resulting yeast plasmid is named pYαEGF-23 and was deposited at the American Type Culture
Collection on 12th August 1983 under ATCC Accession no. 40079. Construction (d), in which a new KpnI
site was generated, was made as described for construction (c) except that the 36-mer oligonucleotide
primer of sequence 5'-GGGTACCTTTGGATAAAAGAAACTCCGACTCCGAAT-3' was used. The resulting
yeast plasmid is named pYαEGF-24. Construction (e) was derived by digestion of the plasmid containing
construction (d) with KpnI and SalI instead of linker 1 and 2. The resulting yeast plasmid is named
pYαEGF-25.

Yeast cells transformed with pYαEGF-22 were grown in 15 ml cultures. At the indicated densities or
times, cultures were centrifuged and the supernatants saved and kept on ice. The cell pellets were washed
in lysis buffer (0.1% Triton X-100, 10 mM NaHPO4, pH 7.5) and broken by vortexing (5 min in 1 min intervals
with cooling on ice in between) in one volume of lysis buffer and one volume of glass beads. After
centrifugation, the supernatants were collected and kept on ice. The amount of hEGF in the culture medium
and cell extracts was measured using the fibroblast receptor binding competition assay. Standard curves
were obtained by measuring the effects of increasing quantities of mouse EGF on the binding of a standard
amount of 125I-labeled mouse EGF.

Proteins were concentrated from the culture media by adsorption on Bio-Rex 70 resin and elution with
0.01 M HCl in 80% ethanol and purified by high performance liquid chromatography (HPLC) on a reverse
phase C18 column. The column was eluted at a flow rate of 4 ml/min with a linear gradient of 5% to 80%
acetonitrile containing 0.2% trifluoroacetic acid in 60 min. Proteins (200-800 picomoles) were sequenced at
the amino-terminal end by the Edman degradation method using a gas-phase protein sequencer, Applied
Biosystems model 470A. The normal PROTFA program was used for all the analyses. Dithiothreitol was
added to S2 (ethyl acetate: 20 mg/liter) and S3 (butyl chloride: 10 mg/liter) immediately before use. All
samples were treated with 1N HCl in methanol at 40° for 15 min to convert PTH-aspartic acid and
PTH-glutamic acid to their methyl esters. All PTH-amino acid identifications were performed by reference to
retention times on an IBM CN HPLC column using a known mixture of PTH-amino acids as standards.
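
The elution conditions stated above (linear gradient of 5% to 80% acetonitrile in 60 min at 4 ml/min) fix the solvent composition at any time point. The short sketch below only restates that arithmetic; the function names and the idea of tabulating a few time points are illustrative assumptions, not part of the described procedure.

```python
def acetonitrile_percent(t_min, start=5.0, end=80.0, duration_min=60.0):
    """Percent acetonitrile at time t_min for a linear gradient."""
    if not 0.0 <= t_min <= duration_min:
        raise ValueError("time lies outside the gradient")
    return start + (end - start) * t_min / duration_min


def volume_delivered_ml(t_min, flow_ml_per_min=4.0):
    """Total eluent volume delivered after t_min at a constant flow rate."""
    return flow_ml_per_min * t_min


for t in (0, 20, 40, 60):
    print(t, "min:", acetonitrile_percent(t), "% acetonitrile,",
          volume_delivered_ml(t), "ml delivered")
```

The gradient rises by 1.25 percentage points per minute, so a retention time read from a chromatogram can be converted to an approximate acetonitrile concentration at elution.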

Secretion from pYαEGF-22 gave a 4:1 mole ratio of native N-terminus hEGF to glu-ala terminated
hEGF, while secretion from pYαEGF-23 to -25 gave only native N-terminated hEGF. Yields of hEGF ranged
from 5 to 8 µg/ml measured either as protein or in a receptor binding assay.
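
The native N-terminus obtained from the mutagenized constructions is consistent with the primer sequences themselves. As an illustration only, the sketch below takes the 36-mer given for construction (d), locates the KpnI recognition sequence GGTACC, and translates the primer with the standard genetic code. The reading frame (starting at the third base) is an assumption chosen because it places Lys-Arg directly before the first hEGF codons; the names used are hypothetical.

```python
# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

PRIMER_D = "GGGTACCTTTGGATAAAAGAAACTCCGACTCCGAAT"  # 36-mer, construction (d)
KPNI_SITE = "GGTACC"


def translate(seq, offset=0):
    """Translate complete codons of seq starting at offset."""
    return "".join(CODON_TABLE[seq[i:i + 3]]
                   for i in range(offset, len(seq) - 2, 3))


site_index = PRIMER_D.find(KPNI_SITE)
peptide = translate(PRIMER_D, offset=2)  # assumed reading frame

print("length:", len(PRIMER_D))
print("KpnI site at index:", site_index)
print("encoded residues:", peptide)
print("Lys-Arg directly precedes Asn-Ser-Asp:", "KRNSD" in peptide)
```

In the assumed frame the primer encodes Lys-Arg immediately followed by Asn-Ser-Asp-Ser-Glu, the first residues of hEGF, with no intervening glu-ala dipeptide.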

The strain JRY188 (MATα sir3-8 leu2-3 leu2-112 trp1 ura3 his4 rme) was transformed with pYαEGF-21
and leucine prototrophs selected at 37°. Saturated cultures were then diluted 1/100 in fresh medium and
grown in leucine selective medium at permissive (24°) and non-permissive (36°) temperatures and culture
supernatants were assayed for the presence of hEGF as described above. The results are shown in the
following table.



Regulated synthesis and secretion of hEGF in transformed
yeast sir3 temperature-sensitive mutants.

Temperature   Transformant   O.D.650   hEGF (pg/ml)

36°           3a             3.5       0.010
                             5.4       0.026
              3b             3.6       0.020
                             6.4       0.024

24°           3a             0.4       34
                             1.3       145
                             2.1       1075
                             4.0       3250
              3b             0.4       32
                             1.4       210
                             2.2       1935
                             4.2       4600

These results indicate that the hybrid α-factor/EGF gene is being expressed under mating type
regulation, even though it is present on a high copy number plasmid.
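
The degree of regulation can be expressed as the ratio of hEGF found at the permissive and non-permissive temperatures. The sketch below uses only the values of the table above; the pairing of each O.D.650 reading with the hEGF value that follows it is an assumption about the table layout, and the comparison of final time points is a crude illustration because the final culture densities differ. The names are illustrative.

```python
# (temperature in degrees, transformant) -> list of (O.D.650, hEGF) pairs,
# transcribed from the table; row pairing is assumed.
TABLE = {
    (36, "3a"): [(3.5, 0.010), (5.4, 0.026)],
    (36, "3b"): [(3.6, 0.020), (6.4, 0.024)],
    (24, "3a"): [(0.4, 34), (1.3, 145), (2.1, 1075), (4.0, 3250)],
    (24, "3b"): [(0.4, 32), (1.4, 210), (2.2, 1935), (4.2, 4600)],
}


def final_hegf(temperature, transformant):
    """hEGF value at the last (densest) time point of a culture."""
    return TABLE[(temperature, transformant)][-1][1]


def fold_difference(transformant):
    """Ratio of final hEGF at 24 degrees to final hEGF at 36 degrees."""
    return final_hegf(24, transformant) / final_hegf(36, transformant)


for name in ("3a", "3b"):
    print(name, "permissive/non-permissive ratio:",
          round(fold_difference(name)))
```

For both transformants the ratio exceeds five orders of magnitude, which illustrates how tightly expression is shut off at the non-permissive temperature.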

In accordance with the subject invention, novel constructs are provided which may be inserted into
vectors to provide for expression of polypeptides having an N-terminal leader sequence and one or more
processing signals to provide for secretion of the polypeptide as well as processing to result in a mature
polypeptide product free of superfluous amino acids. Thus, one can obtain a polypeptide having the
identical sequence to a naturally occurring polypeptide. In addition, because the polypeptide can be
produced in yeast, glycosylation can occur, so that products can be obtained which are identical to the
naturally occurring products. Furthermore, because the product is secreted, greatly enhanced yields can be
obtained based on cell population and processing and purification are greatly simplified. In addition,
employing mutant hosts, expression can be regulated to be turned off or on, as desired.

Although the foregoing invention has been described in some detail by way of illustration and example
for purposes of clarity of understanding, it will be obvious that certain changes and modifications may be
practiced within the scope of the appended claims.

Claims 

Claims for the following Contracting States : BE, CH, DE, DK, ES, FR, GB, GR, IT, LI, LU, NL, SE

1. A DNA construct encoding a protein foreign to yeast, the amino acid sequence of said protein
comprising at least a yeast alpha-factor leader sequence fragment that provides for secretion linked to
a heterologous polypeptide sequence, said protein also containing yeast processing signals between
said alpha-factor leader sequence fragment and said heterologous polypeptide for processing said
protein into said heterologous polypeptide.

2. A DNA construct according to Claim 1 further comprising a yeast promoter at the 5' end.

3. A DNA construct according to any of Claims 1 - 2 wherein said heterologous polypeptide is a
mammalian protein.

4. A DNA construct comprising a sequence comprising the formula:

5'-Tr-L-Sp-Gene*-Te-3'

wherein: 

Tr is a yeast promoter sequence; 

L encodes at least a yeast alpha-factor leader sequence fragment that provides for secretion; 
Sp is a spacer sequence encoding processing signals for processing the precursor polypeptide
encoded by L-Sp-Gene* into the polypeptide encoded by Gene*;

Gene* encodes a polypeptide foreign to yeast; and Te is a transcription termination sequence 
balanced with Tr. 

5. The DNA construct of Claim 4 wherein Tr comprises a yeast alpha-factor promoter sequence.

6. The DNA construct according to either of Claims 4 - 5 wherein Sp contains the sequence 5'-R1-R2-3'
immediately adjacent to the sequence Gene*, R1 being a codon for lysine or arginine, R2 being a codon
for arginine but does not encode a processing signal for dipeptidylaminopeptidase A.

7. A DNA construct according to Claim 6 wherein Sp is 5'-R1-R2-3'.

8. A DNA construct according to Claim 4 wherein Gene* encodes a mammalian protein.

9. A DNA construct according to Claim 5 wherein Gene* encodes a mammalian protein.

10. A DNA construct according to Claim 6 wherein Gene* encodes a mammalian protein.

11. A DNA construct according to Claim 8 wherein said mammalian protein is human epidermal growth
factor.

12. A DNA construct according to Claim 9 wherein said mammalian protein is human epidermal growth
factor.

13. A DNA construct according to Claim 10 wherein said mammalian protein is human epidermal growth
factor.

14. An episomal expression element comprising a DNA construct according to Claim 8 and a replication 
system providing stable maintenance in a yeast host. 

15. An episomal expression element comprising a DNA construct according to Claim 9 and a replication 
system providing stable maintenance in a yeast host. 

16. An episomal expression element comprising a DNA construct according to Claim 10 and a replication 
system providing stable maintenance in a yeast host. 

17. A method for producing a recombinant polypeptide in yeast and having said polypeptide secreted into 
the culture medium, said method comprising: 

providing a yeast host transformed by a DNA construct according to Claim 4; 

growing in said culture medium said transformed yeast under conditions whereby the precursor 
polypeptide encoded by 5'-L-Sp-Gene*-3' is expressed, at least partially processed into a polypeptide
having the sequence encoded by Gene*, and secreted into said culture medium;

and recovering from said culture medium said secreted polypeptide.

18. A method according to Claim 17 wherein Tr comprises a yeast alpha-factor promoter sequence.

19. A method according to either Claim 17 or 18 wherein Sp contains the sequence 5'-R1-R2-3'
immediately adjacent to the sequence Gene*, R1 being a codon for lysine or arginine, R2 being a codon for
arginine, but does not encode a processing signal for dipeptidylaminopeptidase A.

20. A method according to Claim 19 wherein Sp is 5'-R1-R2-3'.

21. A method according to any of Claims 17-20 wherein Gene* encodes a mammalian polypeptide.

22. A method according to Claim 21 wherein said mammalian polypeptide is epidermal growth factor.

23. A method according to any of Claims 17-20 wherein said yeast is strain AB103, ATCC No. 20658.

24. Plasmid pYEGF8, ATCC Accession No. 20658.

25. Plasmid pYαEGF-23, ATCC Accession No. 40079.

26. A method for producing a recombinant polypeptide comprising: 

providing a yeast host transformed by a DNA construct encoding a protein foreign to yeast, the 
amino acid sequence of said protein comprising at least a yeast alpha-factor leader sequence fragment 
that provides for secretion linked to a heterologous polypeptide sequence, said protein also containing 
yeast processing signals between said alpha-factor leader sequence fragment and said heterologous 
polypeptide for processing said protein into said heterologous polypeptide; 

growing in said culture medium said transformed yeast host under conditions whereby said protein 
foreign to yeast is expressed, at least partially processed into said heterologous polypeptide, and 
secreted into said culture medium; 

and recovering from said culture medium said secreted heterologous polypeptide. 

27. A method according to Claim 26 wherein said heterologous polypeptide is a mammalian protein. 

28. A method according to Claim 26 wherein said yeast alpha-factor is Saccharomyces alpha-factor.

29. A method according to claim 26 wherein said heterologous polypeptide is processed intracellularly or
extracellularly to provide a mature polypeptide.

30. A method according to claim 29 wherein said mature polypeptide is selected from the group consisting
of growth hormone, somatomedins, epidermal growth factor, insulin, renin, calcitonin, albumins and
interferons.

31. A method according to any of claims 17, 19 or 21 wherein the polypeptide encoded by Gene* is
processed intracellularly or extracellularly to provide a mature polypeptide.

32. A method according to claim 31 wherein said mature polypeptide is selected from the group consisting
of growth hormone, somatomedins, epidermal growth factor, insulin, renin, calcitonin, albumins and
interferons.

33. A method according to any of claims 26 to 32 wherein said yeast is strain AB103, ATCC No. 20658.

34. A host cell transformed by a DNA construct according to any of claims 1, 4, 6 or 7.

Claims for the following Contracting State : AT

1. A method for producing a recombinant polypeptide 
comprising: 

providing a yeast host transformed by a DNA construct encoding a protein foreign to yeast, the 
amino acid sequence of said protein comprising at least a yeast alpha-factor leader sequence fragment 
that provides for secretion linked to a heterologous polypeptide sequence, said protein also containing 
yeast processing signals between said alpha-factor leader sequence fragment and said heterologous 
polypeptide for processing said protein into said heterologous polypeptide; 

growing in said culture medium said transformed yeast host under conditions whereby said protein 
foreign to yeast is expressed, at least partially processed into said heterologous polypeptide, and 
secreted into said culture medium; 

and recovering from said culture medium said secreted heterologous polypeptide. 

2. A method according to Claim 1 wherein said heterologous polypeptide is a mammalian protein. 

3. A method according to any of Claims 1 - 2 wherein said DNA construct comprises a sequence 
comprising the formula: 

5'-Tr-L-Sp-Gene*-Te-3'

wherein: 

- Tr is a yeast promoter sequence; 

- L encodes said yeast alpha-factor leader sequence fragment; 

- Sp is a spacer sequence encoding said yeast processing signals for processing the precursor 
polypeptide encoded by L-Sp-Gene* into the polypeptide encoded by Gene*; 

- Gene* encodes said heterologous polypeptide; and 

- Te is a transcription termination sequence balanced with Tr. 

4. A method according to Claim 3 wherein Sp contains the sequence 5'-R1-R2-3' immediately adjacent to
the sequence Gene*, R1 being a codon for lysine or arginine, R2 being a codon for arginine but does
not encode a processing signal for dipeptidylaminopeptidase A.

5. A method according to Claim 4 wherein Sp is 5'-R1-R2-3'.

6. A method according to Claim 3 wherein Tr comprises a yeast alpha-factor promoter sequence. 

7. A method according to Claim 1 wherein said yeast alpha-factor is Saccharomyces alpha-factor. 

8. A method according to Claim 1 wherein said yeast alpha-factor is S. cerevisiae alpha-factor. 

9. A method according to Claim 2 wherein said mammalian protein is human epidermal growth factor. 

10. A method according to Claim 4 wherein Gene* encodes human epidermal growth factor.

11. A method according to Claim 5 wherein Gene* encodes human epidermal growth factor.

12. A method according to Claim 7 wherein said heterologous polypeptide comprises human epidermal
growth factor.

13. A method according to Claim 8 wherein said heterologous polypeptide comprises human epidermal
growth factor.

14. A method according to any of claims 1, 7, or 8 wherein said heterologous polypeptide is processed
intracellularly or extracellularly to provide a mature polypeptide.

15. A method according to claim 4 wherein said heterologous polypeptide is processed intracellularly or
extracellularly to provide a mature polypeptide.

16. A method according to claim 15 wherein said mature polypeptide is selected from the group consisting
of growth hormone, somatomedins, epidermal growth factor, insulin, renin, calcitonin, albumins and
interferons.

17. A method according to claim 16 wherein said mature polypeptide is selected from the group consisting
of growth hormone, somatomedins, epidermal growth factor, insulin, renin, calcitonin, albumins and
interferons.

18. A method according to any of Claims 1, 7, 8 and 15 to 17 wherein said yeast is strain AB103, ATCC
No. 20658.

Claims (translation of the French text)

Claims for the following Contracting States : BE, CH, DE, DK, ES, FR, GB, GR, IT, LI, LU, NL,
SE

1. A DNA construct encoding a protein foreign to a yeast, the amino acid sequence of said protein
comprising at least a yeast alpha-factor leader sequence fragment that provides for secretion linked
to a heterologous polypeptide sequence, said protein also containing yeast processing signals between
said alpha-factor leader sequence fragment and said heterologous polypeptide for processing said
protein into said heterologous polypeptide.

2. A DNA construct according to claim 1, further comprising a yeast promoter at the 5' end.

3. A DNA construct according to claim 1 or claim 2, wherein said heterologous polypeptide is a
mammalian protein.

4. A DNA construct comprising a sequence comprising the formula:

5'-Tr-L-Sp-Gene*-Te-3'

wherein:

Tr is a yeast promoter sequence;

L encodes at least a yeast alpha-factor leader sequence fragment that provides for
secretion;

Sp is a spacer sequence encoding processing signals for processing the precursor polypeptide
encoded by L-Sp-Gene* into the polypeptide encoded by Gene*;

Gene* encodes a polypeptide foreign to a yeast; and
Te is a termination sequence balanced with Tr.

5. A DNA construct according to claim 4, wherein Tr comprises a yeast alpha-factor promoter
sequence.

6. A DNA construct according to claim 4 or claim 5, wherein Sp contains the sequence 5'-R1-R2-3'
immediately adjacent to the sequence Gene*, R1 being a codon for lysine or arginine, R2 being a
codon for arginine, but which does not encode a processing signal for
dipeptidylaminopeptidase A.

7. A DNA construct according to claim 6, wherein Sp is 5'-R1-R2-3'.

8. A DNA construct according to claim 4, wherein Gene* encodes a mammalian protein.

9. A DNA construct according to claim 5, wherein Gene* encodes a mammalian protein.

10. A DNA construct according to claim 6, wherein Gene* encodes a mammalian protein.

11. A DNA construct according to claim 8, wherein said mammalian protein is a human epidermal
growth factor.

12. A DNA construct according to claim 9, wherein said mammalian protein is a human epidermal
growth factor.

13. A DNA construct according to claim 10, wherein said mammalian protein is a human epidermal
growth factor.

14. An episomal expression element comprising a DNA construct according to claim 8 and a
replication system providing stable maintenance in a yeast host.

15. An episomal expression element comprising a DNA construct according to claim 9 and a
replication system providing stable maintenance in a yeast host.

16. An episomal expression element comprising a DNA construct according to claim 10 and a
replication system providing stable maintenance in a yeast host.

17. Method for producing a recombinant polypeptide in yeast and obtaining secretion of said
polypeptide into the culture medium, said method comprising:

providing a yeast host transformed with a DNA construct according to claim 4;

growing said transformed yeast in said culture medium under conditions in which the
precursor polypeptide encoded by 5'-L-Sp-Gene*-3' is expressed, at least partially
processed into a polypeptide having the sequence encoded by Gene*, and secreted into
said culture medium;

and recovering said secreted polypeptide from said culture medium.

18. Method according to claim 17, wherein Tr comprises a yeast alpha-factor promoter
sequence.

19. Method according to claim 17 or claim 18, wherein Sp contains the sequence 5'-R1-R2-3'
immediately adjacent to the sequence Gene*, R1 being a codon for lysine or arginine and
R2 being a codon for arginine, but which does not encode a processing signal for
dipeptidylaminopeptidase A.

20. Method according to claim 19, wherein Sp is 5'-R1-R2-3'.

21. Method according to any one of claims 17 to 20, wherein Gene* encodes a mammalian
polypeptide.

22. Method according to claim 21, wherein said mammalian polypeptide is an epidermal
growth factor.

23. Method according to any one of claims 17 to 20, wherein said yeast is strain AB 103,
ATCC No. 20658.

24. Plasmid pYEGF8, ATCC Accession No. 20658.

25. Plasmid pYaEGF23, ATCC Accession No. 40079.

26. Method for producing a recombinant polypeptide, comprising:

providing a yeast host transformed with a DNA construct encoding a protein foreign to
yeast, the amino acid sequence of said protein comprising at least a fragment of yeast
alpha-factor leader sequence which provides for secretion, joined to a heterologous
polypeptide sequence, said protein also containing yeast processing signals between
said alpha-factor leader sequence fragment and said heterologous polypeptide for
processing said protein into said heterologous polypeptide;

growing said transformed yeast host in culture medium under conditions in which said
protein foreign to yeast is expressed, at least partially processed into said
heterologous polypeptide, and secreted into said culture medium;

and recovering said secreted heterologous polypeptide from said culture medium.

27. Method according to claim 26, wherein said heterologous polypeptide is a mammalian
protein.

28. Method according to claim 26, wherein said yeast alpha-factor is Saccharomyces
alpha-factor.

29. Method according to claim 26, wherein said heterologous polypeptide is processed
intracellularly or extracellularly to provide a mature polypeptide.

30. Method according to claim 29, wherein said mature polypeptide is selected from the
group consisting of growth hormone, somatomedins, epidermal growth factor, insulin,
renin, calcitonin, albumins and interferons.

31. Method according to any one of claims 17, 19 or 21, wherein the polypeptide encoded
by Gene* is processed intracellularly or extracellularly to provide a mature polypeptide.

32. Method according to claim 31, wherein said mature polypeptide is selected from the
group consisting of growth hormone, somatomedins, epidermal growth factor, insulin,
renin, calcitonin, albumins and interferons.

33. Method according to any one of claims 26 to 32, wherein said yeast is strain AB 103,
ATCC No. 20658.

34. Host cell transformed with a DNA construct according to any one of claims 1, 4, 6
or 7.
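
The element order recited in claim 4 and the R1-R2 spacer recited in claim 6 can be illustrated with a short sketch. It is offered only as an aid to reading the claims and forms no part of them. The function names and the placeholder strings are hypothetical, and the sketch assumes the standard genetic code written in the DNA alphabet. It does not model the further limitation that the spacer must not encode a processing signal for dipeptidylaminopeptidase A.

```python
from itertools import product

# Standard genetic code, DNA alphabet: the only codons for lysine and arginine.
LYS_CODONS = ("AAA", "AAG")
ARG_CODONS = ("AGA", "AGG", "CGA", "CGC", "CGG", "CGT")


def spacer_candidates():
    """Enumerate six-nucleotide spacers 5'-R1-R2-3' (R1 = Lys or Arg codon, R2 = Arg codon)."""
    r1_options = LYS_CODONS + ARG_CODONS
    return [r1 + r2 for r1, r2 in product(r1_options, ARG_CODONS)]


def assemble_construct(tr, leader, sp, gene, te):
    """Join the five elements in the order 5'-Tr-L-Sp-Gene*-Te-3'."""
    return "".join((tr, leader, sp, gene, te))


if __name__ == "__main__":
    spacers = spacer_candidates()
    print(len(spacers))  # 8 choices for R1 times 6 choices for R2
    # Hypothetical placeholder strings standing in for real sequences.
    print(assemble_construct("<Tr>", "<L>", spacers[0], "<Gene*>", "<Te>"))
```

Under these assumptions there are 48 candidate spacers at the nucleotide level, although they encode only two dipeptides, Lys-Arg and Arg-Arg.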

Claims for the following Contracting State: AT

1. Method for producing a recombinant polypeptide, comprising:

providing a yeast host transformed with a DNA construct encoding a protein foreign to
yeast, the amino acid sequence of said protein comprising at least a fragment of yeast
alpha-factor leader sequence which provides for secretion, joined to a heterologous
polypeptide sequence, said protein also containing yeast processing signals between
said alpha-factor leader sequence fragment and said heterologous polypeptide for
processing said protein into said heterologous polypeptide;

growing said transformed yeast host in culture medium under conditions in which said
protein foreign to yeast is expressed, at least partially processed into said
heterologous polypeptide, and secreted into said culture medium;

and recovering said secreted heterologous polypeptide from said culture medium.

2. Method according to claim 1, wherein said heterologous polypeptide is a mammalian
protein.

3. Method according to claim 1 or claim 2, wherein said DNA construct comprises a sequence
comprising the formula:

5'-Tr-L-Sp-Gene*-Te-3'

wherein:

Tr is a yeast promoter sequence;

L encodes said fragment of yeast alpha-factor leader sequence;

Sp is a spacer sequence encoding said processing signals for processing the precursor
polypeptide encoded by L-Sp-Gene* into the polypeptide encoded by Gene*;

Gene* encodes said heterologous polypeptide; and

Te is a termination sequence balanced with Tr.

4. Method according to claim 3, wherein Sp contains the sequence 5'-R1-R2-3' immediately
adjacent to the sequence Gene*, R1 being a codon for lysine or arginine and R2 being a
codon for arginine, but which does not encode a processing signal for
dipeptidylaminopeptidase A.

5. Method according to claim 4, wherein Sp is 5'-R1-R2-3'.

6. Method according to claim 3, wherein Tr comprises a yeast alpha-factor promoter
sequence.

7. Method according to claim 1, wherein said yeast alpha-factor is a Saccharomyces
alpha-factor.

8. Method according to claim 1, wherein said yeast alpha-factor is an S. cerevisiae
alpha-factor.

9. Method according to claim 2, wherein said mammalian protein is a human epidermal
growth factor.

10. Method according to claim 4, wherein Gene* encodes a human epidermal growth factor.

11. Method according to claim 5, wherein Gene* encodes a human epidermal growth factor.

12. Method according to claim 7, wherein said heterologous polypeptide comprises a human
epidermal growth factor.

13. Method according to claim 8, wherein said heterologous polypeptide comprises a human
epidermal growth factor.

14. Method according to any one of claims 1, 7 or 8, wherein said heterologous polypeptide
is processed intracellularly or extracellularly to provide a mature polypeptide.

15. Method according to claim 4, wherein said heterologous polypeptide is processed
intracellularly or extracellularly to provide a mature polypeptide.

16. Method according to claim 15, wherein said mature polypeptide is selected from the
group consisting of growth hormone, somatomedins, epidermal growth factor, insulin,
renin, calcitonin, albumins and interferons.

17. Method according to claim 16, wherein said mature polypeptide is selected from the
group consisting of growth hormone, somatomedins, epidermal growth factor, insulin,
renin, calcitonin, albumins and interferons.

18. Method according to any one of claims 1, 7, 8 and 15 to 17, wherein said yeast is
strain AB 103, ATCC No. 20658.

Claims

Claims for the following Contracting States: BE, CH, DE, DK, ES, FR, GB, GR, IT, LI, LU, NL, SE

1. DNA construct encoding a protein foreign to yeast, wherein the amino acid sequence of
the protein comprises at least a fragment of yeast alpha-factor leader sequence that
provides for secretion, joined to a heterologous polypeptide sequence, and the protein
further contains yeast processing signals between the alpha-factor leader sequence
fragment and the heterologous polypeptide for processing the protein into the
heterologous polypeptide.

2. DNA construct according to claim 1, further comprising a yeast promoter at the 5' end.

3. DNA construct according to any one of claims 1 to 2, wherein the heterologous
polypeptide is a mammalian protein.

4. DNA construct comprising a sequence that comprises the formula:

5'-Tr-L-Sp-Gene*-Te-3'

in which:

Tr is a yeast promoter sequence;

L encodes at least a fragment of yeast alpha-factor leader sequence that provides for
secretion;

Sp is a spacer sequence encoding processing signals for processing the precursor
polypeptide encoded by L-Sp-Gene* into the polypeptide encoded by Gene*;

Gene* encodes a polypeptide foreign to yeast; and

Te is a termination sequence balanced with Tr.

5. DNA construct according to claim 4, wherein Tr comprises a yeast alpha-factor promoter
sequence.

6. DNA construct according to either of claims 4 to 5, wherein Sp contains the sequence
5'-R1-R2-3' immediately adjacent to the sequence Gene*, R1 being a codon for lysine or
arginine and R2 being a codon for arginine, but not encoding a processing signal for
dipeptidylaminopeptidase A.

7. DNA construct according to claim 6, wherein Sp is 5'-R1-R2-3'.

8. DNA construct according to claim 4, wherein Gene* encodes a mammalian protein.

9. DNA construct according to claim 5, wherein Gene* encodes a mammalian protein.

10. DNA construct according to claim 6, wherein Gene* encodes a mammalian protein.

11. DNA construct according to claim 8, wherein the mammalian protein is a human
epidermal growth factor.

12. DNA construct according to claim 9, wherein the mammalian protein is a human
epidermal growth factor.

13. DNA construct according to claim 10, wherein the mammalian protein is a human
epidermal growth factor.

14. Episomal expression element comprising a DNA construct according to claim 8 and a
replication system that provides stable maintenance in a yeast host.

15. Episomal expression element comprising a DNA construct according to claim 9 and a
replication system that provides stable maintenance in a yeast host.

16. Episomal expression element comprising a DNA construct according to claim 10 and a
replication system that provides stable maintenance in a yeast host.

17. Method for producing a recombinant polypeptide in yeast and obtaining secretion of the
polypeptide into the culture medium, in which:

a yeast host transformed with a DNA construct according to claim 4 is provided;

the transformed yeast is grown in the culture medium under conditions in which the
precursor polypeptide encoded by 5'-L-Sp-Gene*-3' is expressed, at least partially
processed into a polypeptide having the sequence encoded by Gene*, and secreted into
the culture medium;

and the secreted polypeptide is recovered from the culture medium.

18. Method according to claim 17, wherein Tr comprises a yeast alpha-factor promoter
sequence.

19. Method according to either of claims 17 or 18, wherein Sp contains the sequence
5'-R1-R2-3' immediately adjacent to the sequence Gene*, R1 being a codon for lysine or
arginine and R2 being a codon for arginine, but not encoding a processing signal for
dipeptidylaminopeptidase A.

20. Method according to claim 19, wherein Sp is 5'-R1-R2-3'.

21. Method according to any one of claims 17 to 20, wherein Gene* encodes a mammalian
polypeptide.

22. Method according to claim 21, wherein the mammalian polypeptide is an epidermal
growth factor.

23. Method according to any one of claims 17 to 20, wherein the yeast is strain AB 103,
ATCC No. 20658.

24. Plasmid pYEGF8, ATCC Accession No. 20658.

25. Plasmid pYaEGF23, ATCC Accession No. 40079.

26. Method for producing a recombinant polypeptide, comprising the steps of:

providing a yeast host transformed with a DNA construct encoding a protein foreign to
yeast, wherein the amino acid sequence of the protein comprises at least a fragment of
yeast alpha-factor leader sequence that provides for secretion, joined to a heterologous
polypeptide sequence, and the protein further contains yeast processing signals between
the alpha-factor leader sequence fragment and the heterologous polypeptide for
processing the protein into the heterologous polypeptide;

growing the transformed yeast host in culture medium under conditions in which the
protein foreign to yeast is expressed, at least partially processed into the heterologous
polypeptide, and secreted into the culture medium;

and recovering the heterologous polypeptide from the culture medium.

27. Method according to claim 26, wherein the heterologous polypeptide is a mammalian
protein.

28. Method according to claim 26, wherein the yeast alpha-factor is Saccharomyces
alpha-factor.

29. Method according to claim 26, wherein the heterologous polypeptide is processed
intracellularly or extracellularly to provide a mature polypeptide.

30. Method according to claim 29, wherein the mature polypeptide is selected from the
group consisting of growth hormone, somatomedins, epidermal growth factor, insulin,
renin, calcitonin, albumins and interferons.

31. Method according to any one of claims 17, 19 or 21, wherein the polypeptide encoded
by Gene* is processed intracellularly or extracellularly to provide a mature polypeptide.

32. Method according to claim 31, wherein the mature polypeptide is selected from the
group consisting of growth hormone, somatomedins, epidermal growth factor, insulin,
renin, calcitonin, albumins and interferons.

33. Method according to any one of claims 26 to 32, wherein the yeast is strain AB 103,
ATCC No. 20658.

34. Host cell transformed with a DNA construct according to any one of claims 1, 4, 6
or 7.

Claims for the following Contracting State: AT

1. Method for producing a recombinant polypeptide, comprising the steps of:

providing a yeast host transformed with a DNA construct encoding a protein foreign to
yeast, wherein the amino acid sequence of the protein comprises at least a fragment of
yeast alpha-factor leader sequence that provides for secretion, joined to a heterologous
polypeptide sequence, and the protein further contains yeast processing signals between
the alpha-factor leader sequence fragment and the heterologous polypeptide for
processing the protein into the heterologous polypeptide;

growing the transformed yeast host in culture medium under conditions in which the
protein foreign to yeast is expressed, at least partially processed into the heterologous
polypeptide, and secreted into the culture medium;

and recovering the heterologous polypeptide from the culture medium.

2. Method according to claim 1, wherein the heterologous polypeptide is a mammalian
protein.

3. Method according to any one of claims 1 to 2, wherein the DNA construct comprises a
sequence that comprises the formula:

5'-Tr-L-Sp-Gene*-Te-3'

in which:

Tr is a yeast promoter sequence;

L encodes at least a fragment of yeast alpha-factor leader sequence that provides for
secretion;

Sp is a spacer sequence encoding the processing signals for processing the precursor
polypeptide encoded by L-Sp-Gene* into the polypeptide encoded by Gene*;

Gene* encodes the heterologous polypeptide; and

Te is a termination sequence balanced with Tr.

4. Method according to claim 3, wherein Sp contains the sequence 5'-R1-R2-3' immediately
adjacent to the sequence Gene*, R1 being a codon for lysine or arginine and R2 being a
codon for arginine, but not encoding a processing signal for dipeptidylaminopeptidase A.

5. Method according to claim 4, wherein Sp is 5'-R1-R2-3'.

6. Method according to claim 3, wherein Tr includes a yeast alpha-factor promoter
sequence.

7. Method according to claim 1, wherein the yeast alpha-factor is a Saccharomyces
alpha-factor.

8. Method according to claim 1, wherein the yeast alpha-factor is the S. cerevisiae
alpha-factor.

9. Method according to claim 2, wherein the mammalian protein is a human epidermal
growth factor.

10. Method according to claim 4, wherein Gene* encodes a human epidermal growth factor.

11. Method according to claim 5, wherein Gene* encodes a human epidermal growth factor.

12. Method according to claim 7, wherein the heterologous polypeptide includes a human
epidermal growth factor.

13. Method according to claim 8, wherein the heterologous polypeptide includes a human
epidermal growth factor.

14. Method according to any one of claims 1, 7 or 8, wherein the heterologous polypeptide
is processed intracellularly or extracellularly to provide a mature polypeptide.

15. Method according to claim 4, wherein the heterologous polypeptide is processed
intracellularly or extracellularly to provide a mature polypeptide.

16. Method according to claim 15, wherein the mature polypeptide is selected from the
group consisting of growth hormone, somatomedins, epidermal growth factor, insulin,
renin, calcitonin, albumins and interferons.

17. Method according to claim 16, wherein the mature polypeptide is selected from the
group consisting of growth hormone, somatomedins, epidermal growth factor, insulin,
renin, calcitonin, albumins and interferons.

18. Method according to any one of claims 1, 7, 8 and 15 to 17, wherein the yeast is
strain AB 103, ATCC No. 20658.